- Clone the template project where you want:
git clone https://XXX
- Rename the project folder (template_projet) with the name you want for your project (let's call it my_project here):
mv template_projet <my_project>
- Go into the project folder:
cd my_project
- Delete existing git tracking:
rm -rf .git
- Init a new git tracking:
git init --initial-branch=main
- In the GitLab interface, create a project (New project), choose 'Create blank project' and fill in the blanks. Internal visibility level is OK. Uncheck "Initialize repository with a README". In "Pick a group or namespace", choose a group if you work with a team, or your pnom if you work on your own project. Give the project the same name as your folder (my_project here)
- Add the remote for the GitLab project you just created:
git remote add origin https://XXX
(please get the right URL from the Clone button in the GitLab interface. If you are on AWS and you get an error with this URL, use https://gitlab-datalab.quinten-saas.com/ instead: git remote add origin https://[XXX]....git)
- Add all the files:
git add .
(you can check what you are about to add by running git status just before, which is recommended)
- Commit the changes:
git commit -m "initial commit"
- Push (after reading the warning):
git push origin main
/!\ Warning /!\: the command above pushes the local "main" branch, which is the name given by git init --initial-branch=main. Check on GitLab which default branch name the remote repository expects. If your local branch is called "master" instead (check with git branch) and you want the "main" name locally, rename it before pushing:
git branch -m master main
- You're done, congrats! Now follow the next todo
- Rename your source code folder with the name of your project (current is "src")
- Add notebooks/* and logs/* to the .gitignore file
- If you do not use Docker (e.g. in the AWS environment), create a Python environment for your project. You don't have to put this environment inside your project folder. If you do, add its folder to the .gitignore file. Otherwise, you can keep it outside the project, for instance in a dedicated folder.
- Run
pip install --upgrade pip
then
pip install -r requirements.txt
- You can run
python main.py
in a terminal to check that "hello world" is printed :)
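The template's main.py is not reproduced in this README. As a rough sketch only, assuming it sets up the root logger and prints the greeting, a file of that shape could look like the following. The function names configure_logging and main and the log format are hypothetical and may differ from the actual template.

```python
"""Hypothetical sketch of a template-style main.py (not the actual file)."""
import logging


def configure_logging(level: int = logging.INFO) -> None:
    """Configure the root logger once, at program start-up.

    The format string is an assumption made for this example.
    """
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )


def main() -> str:
    """Print the greeting and return it so that it can be checked."""
    configure_logging()
    message = "hello world"
    print(message)
    logging.info("main finished")
    return message


if __name__ == "__main__":
    main()
```

Configuring the root logger in one place is what lets other modules log with a plain import logging, without any set-up of their own.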
Please read carefully the Albus coding framework, detailed here: https://confluence.par.quinten.io/pages/viewpage.action?pageId=64686373 On top of that (and as a few reminders ;)), you also need to pay attention to the following items:
- Use modular coding practices: organise your code using modules, functions and classes. Think of your code as a pipeline: what goes in, what comes out
- Comment your code and document all your functions
- Unit test as much of your code as possible, or at least the most critical functions
- Create a new branch on git each time you develop a new feature
- Find a partner to review your code. Plan at least 2 hours a week (in one block or 2 x 1 h) in scheduled time slots
- Follow the Google coding style for harmonized practices at Quinten: https://google.github.io/styleguide/pyguide.html
- Use black (https://github.com/psf/black) to format your code automatically before each commit:
black src
- Use notebooks only for exploratory analysis and for testing parts of your code. If you have no other choice than using notebooks for your project, remove the notebooks folder from the .gitignore file
- All your imports should start with the name of the project source code folder:
from src.XXXX import XXXX
or
from src import XXXX
- Log your code: main.py contains the logger. You can use it in any module just by calling the logging API:
import logging
logging.info("my log message")
- The organisation inside your source code folder (src) is not mandatory, feel free to reorganize differently. Pay attention to how intuitive the code is.
- Use this README to explain to other users or developers how to use your code
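To make the points above about documentation, logging and unit tests concrete, here is a minimal sketch. The function normalise_scores and its test class are hypothetical examples, not part of the template. The docstring follows the Google style (Args, Returns, Raises), the function logs through the plain logging API, and the tests use the standard unittest module.

```python
import logging
import unittest
from typing import List


def normalise_scores(scores: List[float]) -> List[float]:
    """Rescale scores linearly to the [0, 1] range.

    Args:
        scores: Raw scores; must contain at least one value.

    Returns:
        A new list with the minimum mapped to 0.0 and the maximum to 1.0.
        If all scores are equal, every value is mapped to 0.0.

    Raises:
        ValueError: If scores is empty.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    low, high = min(scores), max(scores)
    span = high - low
    # Uses the root logger, which is assumed to be configured in main.py.
    logging.info("Normalising %d scores (min=%s, max=%s)", len(scores), low, high)
    if span == 0:
        return [0.0 for _ in scores]
    return [(s - low) / span for s in scores]


class NormaliseScoresTest(unittest.TestCase):
    """Unit tests for the hypothetical normalise_scores function."""

    def test_min_and_max_are_mapped_to_bounds(self):
        self.assertEqual(normalise_scores([2.0, 4.0, 6.0]), [0.0, 0.5, 1.0])

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            normalise_scores([])

    def test_call_is_logged(self):
        # assertLogs captures records sent to the root logger.
        with self.assertLogs(level="INFO") as captured:
            normalise_scores([1.0, 3.0])
        self.assertIn("Normalising 2 scores", captured.output[0])
```

In a real project the function would live in a module under the source code folder and the tests in a separate test file, run for instance with python -m unittest.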
Don't do the job twice! Quinten is making an effort to capitalise on the work done in each project in order to improve efficiency on the next ones. Two main tools are available for code capitalisation:
- SCOOLD: dedicated to on-the-fly capitalisation. It contains many code snippets, answers to methodology questions and some domain-specific Q&A. All this information is organised with tags, so please choose them carefully when you post a new question or answer
- NOTEBOOKS: gathers notebooks that each explain a full pipeline dedicated to a specific task (propensity score evaluation, how to use the model interpretation tools SHAP and LIME, benchmark of clustering)
See details of how to use those tools
- .gitlab-ci.yaml: TO BE COMPLETED
- requirements.txt: file that lists all the packages used for the project and their versions
In the cicd folder
- Dockerfile: TO BE COMPLETED
- .dockerignore: TO BE COMPLETED
- build.sh: TO BE COMPLETED
- push.sh: TO BE COMPLETED
- run_test.sh: TO BE COMPLETED
- release.sh: TO BE COMPLETED
All details are provided here: If you want to generate the code documentation using Sphinx, go to the last section of the Confluence page.