thaitextaug.word2vec
Modules
- class thaitextaug.word2vec.Word2VecAug(model: str, tokenize: object, type: str = 'file')
- augment(sentence: str, n_sent: int = 1, p: float = 0.7) → List[Tuple[str]]
- Parameters
sentence (str) – input sentence
n_sent (int) – maximum number of augmented sentences
p (float) – probability
- Returns
list of augmented sentences
- Return type
List[Tuple[str]]
- modify_sent(sent, p=0.7) → List[List[str]]
- Parameters
sent (str) – input sentence
p (float) – probability
- Return type
List[List[str]]
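The return types above (a list of lists from modify_sent, a list of tuples from augment) suggest a two-step flow: collect replacement candidates per token, then combine them into sentences. The following is a minimal sketch of that idea, not the library's code. `TOY_SIMILAR`, `toy_modify_sent`, and `toy_augment` are hypothetical names, the input is assumed to be already tokenized, and `p` is assumed to act as a cutoff on a similarity score, which is only one possible reading of "probability".

```python
import itertools
from typing import Dict, List, Tuple

# Hypothetical similarity table standing in for a trained word2vec model:
# token -> [(neighbour, similarity score)].
TOY_SIMILAR: Dict[str, List[Tuple[str, float]]] = {
    "แมว": [("สุนัข", 0.82), ("ปลา", 0.55)],
    "กิน": [("ทาน", 0.91), ("รับประทาน", 0.75)],
}


def toy_modify_sent(tokens: List[str], p: float = 0.7) -> List[List[str]]:
    """Build one candidate list per token (assumed behaviour)."""
    candidates = []
    for token in tokens:
        # Assumption: p is a cutoff on the neighbour's similarity score.
        similar = [w for w, score in TOY_SIMILAR.get(token, []) if score >= p]
        # Tokens with no qualifying neighbour are kept unchanged.
        candidates.append(similar or [token])
    return candidates


def toy_augment(
    tokens: List[str], n_sent: int = 1, p: float = 0.7
) -> List[Tuple[str, ...]]:
    """Combine the per-token candidates into at most n_sent token tuples."""
    combos = itertools.product(*toy_modify_sent(tokens, p))
    return list(itertools.islice(combos, n_sent))


tokens = ["แมว", "กิน", "ปลา"]
print(toy_modify_sent(tokens))
print(toy_augment(tokens, n_sent=2))
```

Taking the Cartesian product lazily and slicing it keeps the sketch cheap even when many tokens have several candidates.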
- class thaitextaug.word2vec.BPEmbAug(lang: str = 'th', vs: int = 100000, dim: int = 300)
Thai text augmentation using word2vec from BPEmb
BPEmb: github.com/bheinzerling/bpemb
- augment(sentence: str, n_sent: int = 1, p: float = 0.7) → List[Tuple[str]]
Text augmentation using word2vec from BPEmb
- Parameters
sentence (str) – Thai sentence
n_sent (int) – number of sentences to generate
p (float) – per-word probability
- Returns
list of augmented sentences
- Return type
List[Tuple[str]]
- load_w2v()
Load BPEmb model
- tokenizer(text: str) → List[str]
- Parameters
text (str) – Thai text
- Return type
List[str]
- class thaitextaug.word2vec.Thai2fitAug
Text augmentation using word2vec from Thai2Fit
Thai2Fit: github.com/cstorm125/thai2fit
- augment(sentence: str, n_sent: int = 1, p: float = 0.7) → List[Tuple[str]]
Text augmentation using word2vec from Thai2Fit
- Parameters
sentence (str) – Thai sentence
n_sent (int) – number of sentences to generate
p (float) – per-word probability
- Returns
list of augmented sentences
- Return type
List[Tuple[str]]
- load_w2v()
Load thai2fit word2vec model
- tokenizer(text: str) → List[str]
- Parameters
text (str) – Thai text
- Return type
List[str]
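All three classes document the same augment(sentence, n_sent, p) signature, so calling code can be written against that signature alone. The sketch below illustrates this under stated assumptions: `Augmenter`, `augment_corpus`, and `EchoAug` are hypothetical names introduced here, and each returned tuple is assumed to hold the tokens of one sentence, joined without spaces because Thai is written without them.

```python
from typing import Iterable, List, Protocol, Tuple


class Augmenter(Protocol):
    """Any object exposing the documented augment() signature."""

    def augment(
        self, sentence: str, n_sent: int = 1, p: float = 0.7
    ) -> List[Tuple[str, ...]]: ...


def augment_corpus(
    augmenter: Augmenter,
    sentences: Iterable[str],
    n_sent: int = 1,
    p: float = 0.7,
) -> List[str]:
    """Run augment() over each sentence and return the results as strings."""
    results = []
    for sentence in sentences:
        for tokens in augmenter.augment(sentence, n_sent=n_sent, p=p):
            # Assumption: each tuple holds the tokens of one sentence.
            results.append("".join(tokens))
    return results


class EchoAug:
    """Hypothetical stand-in so the sketch runs without downloading a model."""

    def augment(self, sentence, n_sent=1, p=0.7):
        # Splits on whitespace and repeats the input; performs no augmentation.
        return [tuple(sentence.split())] * n_sent


print(augment_corpus(EchoAug(), ["ผม เรียน"], n_sent=2))
```

In practice an instance of Thai2fitAug or BPEmbAug would take the place of EchoAug. Whether the model is loaded on construction or requires calling load_w2v() first is not stated in the reference above.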