Prepare text as tree or list
Tree
The jsngram.dir2 module provides an easy way to list the files in a directory tree.
import jsngram.jsngram
import jsngram.dir2

ix = jsngram.jsngram.JsNgram()
# adding from directory "src"
for file in jsngram.dir2.list_files(src):
    ix.add_file(file)
List
An explicit list of files can be used instead of a directory tree. A list of (id, content) pairs can also be used instead of files.
import jsngram.jsngram

ix = jsngram.jsngram.JsNgram()
# adding from file list "files"
for file in files:
    ix.add_file(file)
# adding (id, content) pairs "contents"
for id, content in contents:
    ix.add_document(id, content)
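The inputs named "files" and "contents" above are not defined in this section. As a minimal sketch, they could be built with the standard library alone. The helpers collect_files and read_pairs are hypothetical names used for illustration and are not part of jsngram. Using the file path as the document id is also an assumption, as is the UTF-8 encoding, and this section does not say whether add_file expects absolute or relative paths.

```python
import os


def collect_files(root):
    """Return a sorted list of file paths under root (hypothetical helper)."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def read_pairs(paths, encoding="utf-8"):
    """Yield (id, content) pairs, using each path as the document id.

    Assumption: the path is an acceptable id; any unique string would do.
    """
    for path in paths:
        with open(path, encoding=encoding) as f:
            yield path, f.read()
```

With these helpers, files = collect_files("src") and contents = list(read_pairs(files)) would produce values in the shapes the loops above iterate over.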