Open
70 commits
19d1b5c
chore: setup devcontainer and project structure
FilippoMarletta Jan 18, 2026
4088a4b
feat: implement get_departments and models
FilippoMarletta Jan 19, 2026
f783e35
chore: add .gitattributes to enforce LF line endings
FilippoMarletta Jan 19, 2026
171de79
build: add pytest-mock and pytest-cov to requirements.txt
FilippoMarletta Jan 21, 2026
a4d2683
feat: add get_courses function and update models for course data hand…
FilippoMarletta Jan 21, 2026
3a8b6b3
feat: add tests for get_departments and get_courses functions
FilippoMarletta Jan 21, 2026
288d224
feat: implement parse_course_name function and add corresponding tests
FilippoMarletta Jan 21, 2026
54abfe1
chore: update .gitignore to include coverage and cache files
FilippoMarletta Jan 21, 2026
519220d
build: update vscode extensions
FilippoMarletta Jan 24, 2026
db67ba1
fix: prevent Pylance from crashing
FilippoMarletta Jan 27, 2026
47f19f9
feat: add tests for get_activities and parse_insegnamento_data functions
FilippoMarletta Jan 27, 2026
a5ea68f
feat: add get_activities function, insegnamento dataclass and parsing…
FilippoMarletta Jan 27, 2026
50068a3
feat: add SchedaOpis dataclass
FilippoMarletta Feb 5, 2026
f7d7893
feat: add function parse_scheda_opis and corresponding tests
FilippoMarletta Feb 7, 2026
1c85d9c
feat: add get_questions function and update related models and transf…
FilippoMarletta Mar 1, 2026
de2e526
feat: implement scraper functionality with logging and data processing
FilippoMarletta Mar 2, 2026
20696ad
refactor: use requests.Session to improve scraping speed
FilippoMarletta Mar 2, 2026
d73c5e1
feat: enhance API client with logging and timeout management
FilippoMarletta Mar 8, 2026
b08d43f
feat: extend SchedaOpis model with additional and previously missing fields
FilippoMarletta Mar 8, 2026
99e4c04
build: add mysql-connector and python-dotenv dependencies and update c…
FilippoMarletta Mar 8, 2026
29f402a
fix: update parse_course_name regex and rewrite parse_scheda_opis to …
FilippoMarletta Mar 8, 2026
de010a4
feat: implement database connection and CRUD operations for departmen…
FilippoMarletta Mar 8, 2026
5afaabc
feat: enhance scraper functionality with concurrent processing and im…
FilippoMarletta Mar 8, 2026
b6290fb
fix: update parse_course_name regex for improved matching and add co…
FilippoMarletta Mar 8, 2026
47502a5
fix: ensure professor names default to empty string if not present in…
FilippoMarletta Mar 8, 2026
0092fd2
feat: add random sampling of activities and departments in debug mode…
FilippoMarletta Mar 8, 2026
4778821
fix: update parse_course_name regex to support 'c.u.' and 'cu' format…
FilippoMarletta Mar 9, 2026
3a85d1c
fix: add previously missing nome_modulo field to Insegnamento model a…
FilippoMarletta Mar 9, 2026
6c833c6
refactor: streamline parse_scheda_opis_data function by removing unus…
FilippoMarletta Mar 10, 2026
31b9759
fix: update mock API calls to use session.post and adjust test data f…
FilippoMarletta Mar 10, 2026
0ff9519
fix: temporary solution to handle missing or invalid activity codes
FilippoMarletta Mar 14, 2026
e028436
chore: load DEBUG_MODE from .env for better configuration management
FilippoMarletta Mar 14, 2026
689c257
feat: add additional case for alphanumeric activityCode
FilippoMarletta Mar 14, 2026
5264dfc
feat: add unit tests for database.py functions
FilippoMarletta Mar 14, 2026
60f3987
feat: enhance error handling in database insertion functions and add …
FilippoMarletta Mar 16, 2026
0758720
chore: update parse_course_name to accept optional full_name and enha…
FilippoMarletta Mar 16, 2026
d98d523
build: update devcontainer settings for improved Python development e…
FilippoMarletta Mar 18, 2026
349b1e1
fix: update process_activity and process_course function signatures t…
FilippoMarletta Mar 18, 2026
35b8edf
fix: add missing space in postCreateCommand
FilippoMarletta Mar 19, 2026
68ec559
feat: add CI workflow for Python testing
FilippoMarletta Mar 19, 2026
c92f995
fix: normalize case for "Non Frequentanti" in parse_scheda_opis_data …
FilippoMarletta Mar 19, 2026
07bb0fc
fix: update mock_opis_json to include "Studenti Non Frequentanti" dat…
FilippoMarletta Mar 19, 2026
d221b1c
feat: add tests for database connection failure and handling inserts …
FilippoMarletta Mar 19, 2026
69f14ce
chore: update checkout action version to v5 in CI workflow
FilippoMarletta Mar 19, 2026
1a77886
chore: update checkout action version to v6 in CI workflow
FilippoMarletta Mar 19, 2026
2dd88e1
chore: update setup-python action version to v6 in CI workflow
FilippoMarletta Mar 19, 2026
90e8af1
fix: rename test execution step for clarity in CI workflow
FilippoMarletta Mar 19, 2026
d2822b9
ci: update CI workflow for linting, type checking and testing
FilippoMarletta Mar 20, 2026
d218dda
style: linting with black
FilippoMarletta Mar 20, 2026
3e86721
ci: set minimum coverage threshold to 80%
FilippoMarletta Mar 20, 2026
e9396e3
docs: add README.md
FilippoMarletta Mar 20, 2026
dcd244e
chore: add .env.example
FilippoMarletta Mar 23, 2026
89a476a
docs: update environment variables section
FilippoMarletta Mar 23, 2026
07996c1
fix: add missing field in Insegnamento
FilippoMarletta Mar 24, 2026
d74f15f
feat: add assign_channels function
FilippoMarletta Mar 24, 2026
fd43c8b
test: add tests for assign_channels function and adapt previous tests…
FilippoMarletta Mar 24, 2026
c85fefe
fix: more general regex for parse_course_name
FilippoMarletta Mar 25, 2026
cb1a547
tests: add 2 test cases for test_parse_course_name
FilippoMarletta Mar 25, 2026
8d8c150
feat: enriches debug mode with more customization
FilippoMarletta Mar 25, 2026
d2e0669
style: black linting
FilippoMarletta Mar 25, 2026
f6291b2
ci: add pylint, flake8, mypy and isort to CI pipeline and split require…
FilippoMarletta Mar 26, 2026
d565ebc
feat: add .pylintrc
FilippoMarletta Mar 26, 2026
cf7d43a
chore: add Makefile for quick linting checks
FilippoMarletta Mar 26, 2026
10ababa
chore: update devcontainer to python 3.14 and postCreateCommand
FilippoMarletta Mar 26, 2026
928bb93
style: fix some linting issues across multiple files
FilippoMarletta Mar 26, 2026
3736359
refactor: extract _process_cluster_data and _process_graph_pie from p…
FilippoMarletta Mar 26, 2026
5266bfa
Refactor: fix strict linting and typing issues
FilippoMarletta Mar 26, 2026
b519687
ci: update isort command to run with black profile
FilippoMarletta Mar 26, 2026
03cea9c
refactor: add missing return types and make insert_schede_opis return…
FilippoMarletta Mar 26, 2026
653b295
Merge branch 'master' into py_scraper
FilippoMarletta Mar 26, 2026
2 changes: 1 addition & 1 deletion .github/workflows/lint.yml
@@ -49,4 +49,4 @@ jobs:
         run: black --check --diff $(git ls-files '*.py')
 
       - name: Check code formatting with isort
-        run: isort --check-only --diff $(git ls-files '*.py')
+        run: isort --profile black --check-only --diff $(git ls-files '*.py')
39 changes: 39 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,39 @@
name: CI

on:
  push:
    branches: [ main, master, py_scraper ]
  pull_request:
    branches: [ main, master ]

jobs:
  test:
    runs-on: ubuntu-latest

    defaults:
      run:
        working-directory: ./python_scraper

    steps:
      - name: Check out the code
        uses: actions/checkout@v6

      - name: Setup Python 3.14
        uses: actions/setup-python@v6
        with:
          python-version: "3.14"
          cache: "pip"

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
          if [ -f requirements_dev.txt ]; then pip install -r requirements_dev.txt; fi

      - name: Type checking
        run: |
          pyright src tests

      - name: Run tests with coverage
        run: |
          pytest --cov=src --cov-report=term-missing --cov-fail-under=80
44 changes: 44 additions & 0 deletions python_scraper/.devcontainer/devcontainer.json
@@ -0,0 +1,44 @@
{
    "name": "OPIS Python Scraper",
    "image": "mcr.microsoft.com/devcontainers/python:3.14",
    "customizations": {
        "vscode": {
            "settings": {
                "python.defaultInterpreterPath": "/usr/local/bin/python",
                "python.languageServer": "Pylance",
                "python.analysis.nodeExecutable": "auto",
                "python.analysis.typeCheckingMode": "standard",
                "python.analysis.autoImportCompletions": true,
                "editor.defaultFormatter": "ms-python.black-formatter",
                "editor.formatOnSave": true,
                "python.formatting.provider": "none",
                "black-formatter.importStrategy": "fromEnvironment",
                "black-formatter.path": [
                    "black"
                ],
                "python.analysis.exclude": [
                    "**/__pycache__",
                    "**/.venv",
                    "**/node_modules",
                    "**/dist",
                    "**/build"
                ],
                "python.testing.pytestEnabled": true,
                "python.testing.pytestArgs": [
                    "."
                ]
            },
            "extensions": [
                "ms-python.python",
                "ms-python.vscode-pylance",
                "ms-python.debugpy",
                "ms-python.black-formatter",
                "njpwerner.autodocstring",
                "KevinRose.vsc-python-indent",
                "GitHub.copilot-chat"
            ]
        }
    },
    "postCreateCommand": "sudo pip install --upgrade pip --root-user-action=ignore && pip install -r requirements.txt -r requirements_dev.txt",
    "remoteUser": "vscode"
}
10 changes: 10 additions & 0 deletions python_scraper/.env.example
@@ -0,0 +1,10 @@
DB_HOST=127.0.0.1
DB_PORT=3306
DB_DATABASE=opis_manager
DB_USERNAME=root
DB_PASSWORD=

DEBUG_MODE=False
DEBUG_NUM_ACTIVITIES=5
DEBUG_NUM_COURSES=1
DEBUG_NUM_DEPARTMENTS=1
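
Review note: the variables in this `.env.example` arrive as plain strings, so the consumer has to convert the port, the debug flag, and the sample sizes itself. The sketch below shows one way that conversion could look. It is not the scraper's actual code: `Settings`, `parse_bool`, and `load_settings` are hypothetical names, the defaults simply mirror the example file, and it assumes the variables are already in the process environment (for instance after calling python-dotenv's `load_dotenv()`).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical typed view of the variables listed in .env.example."""

    db_host: str
    db_port: int
    db_database: str
    db_username: str
    db_password: str
    debug_mode: bool
    debug_num_activities: int
    debug_num_courses: int
    debug_num_departments: int


def parse_bool(raw: str) -> bool:
    """Treat common truthy spellings as True; anything else is False."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, falling back to the example defaults."""

    def get(key: str, default: str) -> str:
        # Strip stray whitespace such as the space in "DB_HOST= 127.0.0.1".
        return env.get(key, default).strip()

    return Settings(
        db_host=get("DB_HOST", "127.0.0.1"),
        db_port=int(get("DB_PORT", "3306")),
        db_database=get("DB_DATABASE", "opis_manager"),
        db_username=get("DB_USERNAME", "root"),
        db_password=get("DB_PASSWORD", ""),
        debug_mode=parse_bool(get("DEBUG_MODE", "False")),
        debug_num_activities=int(get("DEBUG_NUM_ACTIVITIES", "5")),
        debug_num_courses=int(get("DEBUG_NUM_COURSES", "1")),
        debug_num_departments=int(get("DEBUG_NUM_DEPARTMENTS", "1")),
    )
```

Passing the mapping as a parameter instead of reading `os.environ` directly keeps the function easy to exercise in tests, for example `load_settings({"DEBUG_MODE": "true"})`.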
2 changes: 2 additions & 0 deletions python_scraper/.gitattributes
@@ -0,0 +1,2 @@
# Always use Linux-style line endings (LF) when committing and checking out.
* text=auto eol=lf
39 changes: 39 additions & 0 deletions python_scraper/.gitignore
@@ -0,0 +1,39 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Virtual environments
venv/
env/
ENV/

.vscode/

.coverage
.pytest_cache/
.mypy_cache/
calc_cov/

.env