You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Initialize a git repository at the folder of your second brain
conda install gitpython
Inside utils.py-> add the path of your second brain at second_brain_path
How to use
To index your data first -> python easiest_search.py --indexed False --search False
To search your data -> `python easiest_search.py --indexed True --search True``
How does it work?
First we get a list that has been updated in our second brain since the last time we indexed our second brain, here we differentiate between new and modified files. For modified files we delete the old file from the index and add the new file by calling the remove_old_note function through our flow
In the add_highlight method, we divide the note into "title" and "body". The title is the name of the file and the body starts after the first heading (we usually have a note that starts with '# <Title>')
The get_highlight_with_embedded_notes creates a jina document type which has as text the title of the note, and it would have multiple "chunks" representing the content of the note. Every sentence is considered its own "chunk". Later, when we are searching in our second brain we search against these sentences for a match
After files have been indexed, they are comitted in the repo we established in the second brain folder
For each sentence we create an embedding (vector representation) in 384 dimensions using the method encode_sentencesthis takes place on the jina cloud to speed up the process
Your indexed database lives on your machine under workspace only the embedding would happen on the cloud but all your information lives locally on your own machine
When you are searching, your query gets turned into a vector, and matched to the nearest neighbout under the function search