| title | Note Categorization and Classification | |
|---|---|---|
| description | This document describes how the OSM Notes Analytics project helps categorize and classify | |
| version | 1.0.0 | |
| last_updated | 2026-01-25 | |
| author | AngocA | |
| tags | | |
| audience | | |
| project | OSM-Notes-Analytics | |
| status | active | |
Status: Documentation
Related Articles:
AngocA's Diary - Note Types
This document describes how the OSM Notes Analytics project helps categorize and classify OpenStreetMap notes. The project provides metrics and analytics that enable automatic and manual categorization of notes based on their characteristics, outcomes, and patterns.
The OSM Notes Analytics system helps categorize notes by:
- Analyzing note outcomes: Which notes were processed, simply closed, or need more data
- Identifying note types: Classifying notes based on their content and purpose
- Tracking patterns: Understanding how different types of notes behave over time
- Supporting resolution: Helping mappers prioritize and process notes effectively
Based on the comprehensive classification system described in AngocA's diary article on note types, notes can be categorized into two main groups:
The first group consists of notes that lead to actual changes in the map. They can be further classified as:
- Description: Notes indicating new places not yet mapped
- Examples: New restaurant, neighborhood name, complementing existing data (road surface, building address)
- Characteristics:
  - Often created via assisted applications (Maps.me, StreetComplete, OrganicMaps, OnOSM.org)
  - Clear, actionable descriptions
  - Specific location information
- Description: Notes that correct existing map data
- Examples: Incorrect street name, wrong road surface (paved vs grass)
- Characteristics:
  - Created by users aware of the map who want to improve it
- Value: Very valuable for map quality
- Description: Notes that help keep the map updated by removing outdated data
- Examples: Closed restaurant, changed business hours (no longer 24 hours)
- Characteristics:
  - Users verify on-site and report discrepancies
  - Reflect responsibility for keeping the map current
  - Common during events like COVID-19 (business closures)
- Description: Notes used for advertising purposes
- Examples: Overly detailed descriptions ("best Italian restaurant")
- Characteristics: Marketing language, excessive explanations
- Description: Notes referencing satellite imagery issues
- Examples: Cloud coverage preventing mapping, new imagery available
- Characteristics: May reference Bing imagery, Strava Heatmap, GPX traces, OpenAerialMap, etc.
- Description: Notes describing correctable problems in large areas
- Examples: Many buildings with "ele" instead of "height" tag
- Characteristics: Systematic errors affecting many features
- Description: Notes requiring extensive mapping
- Examples: Intermunicipal route following national route, missing river (kilometers of mapping)
- Characteristics: Large-scale changes
The second group consists of notes that should typically be closed without making map changes. They include:
- Description: Notes containing personal information
- Examples: "Casa de Andrés Gómez", phone numbers, "Casa de los Gómez", "casa mamá"
- Action: Should be closed and not added to map (privacy concern)
- Recommendation: Should be deletable/hideable to protect privacy
- Description: Notes with no content
- Action: Should be closed
- Description: Notes expressing opinions or perceptions
- Examples: "Nice place", "cozy place", "unsafe area", "robbery at night"
- Characteristics: Subjective, not mappable
- Description: Notes describing services that can't be mapped
- Examples: Hair salon services (men's cuts, children's cuts, manicure), restaurant menu
- Characteristics: Information that doesn't belong in OSM
- Description: Notes promoting services or quality
- Examples: Service quality descriptions, promotions
- Action: Should be closed (doesn't contribute to map)
- Description: Notes indicating changes already mapped
- Causes:
  - Notes not processed in time
  - Conditions changed for other reasons
  - Satellite imagery or Strava Heatmap shows changes already reflected
- Action: Should be closed
- Description: Notes created without proper pin location
- Examples: Pin in middle of road when note refers to building interior (shop)
- Action: Request more details and close
- Characteristics: Common issue, needs clarification
- Description: Notes created due to device positioning issues, not map problems
- Characteristics: User believes it's a map problem, but it's a device issue
- Description: Notes indicating what's already in the map
- Examples: City or town names already mapped
- Action: Should be closed
- Description: Notes that don't contribute to the map
- Characteristics: Unclear purpose, no actionable information
- Description: Notes indicating map problems but lacking details for correction
- Examples: Missing river without route details, area-based features (postal codes)
- Characteristics: Problem exists but can't be mapped without more information
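Some of the non-actionable types above can be pre-flagged with simple text rules before a mapper looks at the note. The following is a minimal sketch, not part of the project: the patterns, the category labels, and the function name `flag_non_actionable` are all hypothetical, loosely derived from the examples listed above, and would need tuning per language and region.

```python
import re
from typing import Optional

# Hypothetical patterns, loosely based on the examples in the list above.
PHONE_RE = re.compile(r"(?:\+?\d[\s.-]?){7,}")  # 7+ digits with optional separators
PERSONAL_RE = re.compile(r"\b(?:casa de|casa mam[aá])\b", re.IGNORECASE)
OPINION_RE = re.compile(r"\b(?:nice|cozy|unsafe)\b", re.IGNORECASE)


def flag_non_actionable(text: str) -> Optional[str]:
    """Return a hypothetical category label, or None when no rule fires."""
    stripped = text.strip()
    if not stripped:
        return "empty"
    if PHONE_RE.search(stripped) or PERSONAL_RE.search(stripped):
        return "personal_data"
    if OPINION_RE.search(stripped):
        return "opinion"
    return None  # no rule matched; the note needs human review


for sample in ["", "Casa de Andrés Gómez", "Nice place", "Incorrect street name"]:
    print(repr(sample), "->", flag_non_actionable(sample))
```

A rule that returns `None` says nothing about whether the note is actionable; it only means none of these coarse filters matched.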
The OSM Notes Analytics system provides metrics that help identify note types:
- avg_days_to_resolution: Notes that take longer may be problematic
- resolution_rate: Country/user patterns indicate note quality
- notes_still_open_count: Backlog indicates unresolved issues

- comment_length: Very short notes may be empty or lack precision
- has_url: Notes with URLs may be advertising or have more context
- has_mention: Notes with mentions may need collaboration
- avg_comments_per_note: High comment count may indicate discussion or lack of clarity

- user_response_time: Fast responders may handle different note types
- notes_opened_but_not_closed_by_user: Users who report but don't resolve
- collaboration_patterns: Notes requiring collaboration

- applications_used: Notes from assisted apps (Maps.me, etc.) are more likely to be actionable
- mobile_apps_count vs desktop_apps_count: Different note types from different platforms

- notes_health_score: Overall community note quality
- new_vs_resolved_ratio: Balance between new notes and resolutions
- notes_age_distribution: Old notes may be obsolete or problematic
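To make the resolution metrics above concrete, here is a small sketch of how three of them could be derived from raw open/close dates. The formulas are assumptions chosen for illustration (the project's Metric Definitions reference is authoritative), and the `Note` class and `resolution_metrics` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Note:
    opened: date
    closed: Optional[date] = None  # None means the note is still open


def resolution_metrics(notes: List[Note]) -> Dict[str, float]:
    """Compute assumed versions of three metrics for a group of notes."""
    closed = [n for n in notes if n.closed is not None]
    days = [(n.closed - n.opened).days for n in closed]
    return {
        # Share of notes in the group that have been closed.
        "resolution_rate": len(closed) / len(notes) if notes else 0.0,
        # Mean number of days between opening and closing, closed notes only.
        "avg_days_to_resolution": sum(days) / len(days) if days else 0.0,
        # Backlog size.
        "notes_still_open_count": len(notes) - len(closed),
    }


sample = [
    Note(date(2024, 1, 1), date(2024, 1, 11)),
    Note(date(2024, 1, 1), date(2024, 1, 21)),
    Note(date(2024, 1, 1)),
    Note(date(2024, 2, 1)),
]
print(resolution_metrics(sample))
```

Grouping the input by country or by user before calling the function gives the per-country and per-user patterns mentioned above.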
-- Notes with lack of precision or abstract descriptions
SELECT
    id_note,
    opened_dimension_id_date,
    comment_length,
    has_url,
    has_mention,
    total_comments_on_note
FROM dwh.facts
WHERE action_comment = 'opened'
  AND comment_length < 50         -- Very short notes
  AND total_comments_on_note > 2  -- Multiple comments (discussion)
ORDER BY opened_dimension_id_date DESC;

-- Notes from assisted applications with good content
SELECT
    f.id_note,
    f.opened_dimension_id_date,
    f.comment_length,
    f.has_url,
    a.application_name
FROM dwh.facts f
JOIN dwh.dimension_applications a
  ON f.dimension_application_creation = a.dimension_application_id
WHERE f.action_comment = 'opened'
  AND a.application_name IN ('Maps.me', 'StreetComplete', 'OrganicMaps', 'OnOSM.org')
  AND f.comment_length > 30  -- Has sufficient description
ORDER BY f.opened_dimension_id_date DESC;

-- Notes with characteristics of non-actionable types
SELECT
    f.id_note,
    f.opened_dimension_id_date,
    f.comment_length,
    f.has_url,
    dc.notes_health_score
FROM dwh.facts f
JOIN dwh.datamartCountries dc
  ON f.dimension_id_country = dc.dimension_country_id
WHERE f.action_comment = 'opened'
  AND (
    f.comment_length < 20                          -- Very short (empty or minimal)
    OR (f.comment_length > 200 AND f.has_url)      -- Long with URL (advertising)
  )
  AND f.total_comments_on_note = 0                 -- No discussion
ORDER BY f.opened_dimension_id_date DESC;

-- Notes that are very old and still open
SELECT
    f.id_note,
    f.opened_dimension_id_date,
    -- In PostgreSQL, date - date already yields an integer number of days;
    -- the cast keeps that true even if date_id is stored as a timestamp.
    (CURRENT_DATE - d.date_id::date) AS days_open,
    f.total_comments_on_note
FROM dwh.facts f
JOIN dwh.dimension_days d
  ON f.opened_dimension_id_date = d.dimension_day_id
WHERE f.action_comment = 'opened'
  AND NOT EXISTS (
    SELECT 1
    FROM dwh.facts f2
    WHERE f2.id_note = f.id_note
      AND f2.action_comment = 'closed'
  )
  AND (CURRENT_DATE - d.date_id::date) > 180  -- More than 6 months
ORDER BY days_open DESC;

The analytics system supports note resolution campaigns by:
- Identifying Priority Notes:
  - Notes that contribute with changes (high priority)
  - Notes needing more data (medium priority)
  - Notes that should be closed (low priority)
- Tracking Campaign Progress:
  - Resolution rates by country
  - Notes resolved vs created
  - Community health scores
- Understanding Patterns:
  - Which note types are most common
  - Which applications generate most actionable notes
  - User behavior patterns
- Resource Allocation:
  - Focus efforts on notes that will have impact
  - Identify areas needing more mappers
  - Track resolution efficiency
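The three priority levels listed above can be turned into a simple work queue for a campaign. This is an illustrative sketch only: the outcome labels and the names `Priority`, `PRIORITY_BY_OUTCOME`, and `order_for_campaign` are hypothetical, and it assumes each note has already been assigned an outcome by some other means.

```python
from enum import IntEnum
from typing import Iterable, List, Tuple


class Priority(IntEnum):
    HIGH = 1    # notes that contribute with changes
    MEDIUM = 2  # notes needing more data
    LOW = 3     # notes that should be closed


# Hypothetical outcome labels mapped to the priorities described above.
PRIORITY_BY_OUTCOME = {
    "contributes_change": Priority.HIGH,
    "needs_more_data": Priority.MEDIUM,
    "should_be_closed": Priority.LOW,
}


def order_for_campaign(notes: Iterable[Tuple[int, str]]) -> List[int]:
    """Return note ids sorted by priority; ties keep their input order."""
    ranked = sorted(notes, key=lambda n: PRIORITY_BY_OUTCOME[n[1]])
    return [note_id for note_id, _ in ranked]


queue = order_for_campaign([
    (1, "should_be_closed"),
    (2, "contributes_change"),
    (3, "needs_more_data"),
    (4, "contributes_change"),
])
print(queue)
```

Because Python's `sorted` is stable, notes with the same priority stay in the order they were supplied, so a caller can pre-sort by age or location first.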
- Tipos de notas (AngocA's Diary)
  - Comprehensive classification of note types
  - Examples and characteristics of each type
  - Basis for this categorization system
- Manipulación de notas (AngocA's Diary)
  - How to create, view, and resolve notes
  - Tools and workflows for note management
  - Visual examples of note workflows
- Análisis de notas (AngocA's Diary)
  - Analysis techniques for notes
  - Patterns and insights
- Técnicas de creación y resolución de notas (AngocA's Diary)
  - Best practices for creating notes
  - Resolution techniques and strategies
- Proyecto de resolución de notas - Preparación premios (OSM Wiki)
  - Note resolution project documentation
  - Campaign organization and recognition
- Metric Definitions: Complete reference for all metrics
- Dashboard Analysis: Available metrics and dashboards
- Use Cases and Personas: User scenarios and queries
- ML Implementation Plan: Automated note classification using ML
- Identify actionable notes: Find notes that will lead to map improvements
- Prioritize work: Focus on notes that contribute with changes
- Understand patterns: Learn which note types are common in your area
- Organize campaigns: Use metrics to plan note resolution campaigns
- Track progress: Monitor resolution rates and community health
- Identify issues: Find areas with problematic note patterns
- Analyze note types: Understand distribution of note categories
- Study patterns: Identify trends in note creation and resolution
- Measure impact: Track how different note types affect map quality
The ML Implementation Plan describes plans for automated note classification using machine learning:
- Action prediction: Will a note be processed, closed, or need more data?
- Type classification: Automatically categorize notes by type
- Priority scoring: Identify high-priority notes automatically
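Any of these models would need numeric or boolean inputs derived from the note text. The sketch below shows one plausible way to compute three of the content signals named earlier in this document (comment_length, has_url, has_mention). The regular expressions and the function name `extract_features` are assumptions for illustration; the project may define these fields differently.

```python
import re
from typing import Dict, Union

URL_RE = re.compile(r"https?://\S+")
# "@name" not preceded by a word character, so e-mail addresses are ignored.
MENTION_RE = re.compile(r"(?<!\w)@\w+")


def extract_features(text: str) -> Dict[str, Union[int, bool]]:
    """Derive simple content features from the opening comment of a note."""
    return {
        "comment_length": len(text),
        "has_url": bool(URL_RE.search(text)),
        "has_mention": bool(MENTION_RE.search(text)),
    }


print(extract_features("See https://example.org @mapper"))
print(extract_features("Contact a@b.com"))
```

Features like these are cheap to compute and easy to inspect, which makes them a reasonable baseline before trying text embeddings or other heavier inputs.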
Future metrics could include:
- Note type distribution: Percentage of each note type
- Resolution success rate: By note type
- Time to resolution: By note type
- User expertise: By note type handled
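As a rough illustration of the first three future metrics, the following sketch aggregates them from notes that already carry a type label. Everything here is hypothetical: the type labels, the input shape, the function name `metrics_by_type`, and the choice to treat "resolved" as "has a resolution time".

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Tuple


def metrics_by_type(notes: List[Tuple[str, Optional[int]]]) -> Dict[str, dict]:
    """notes: (note_type, days_to_resolution), with None for unresolved notes."""
    counts = Counter(note_type for note_type, _ in notes)
    resolved_days = defaultdict(list)
    for note_type, days in notes:
        if days is not None:
            resolved_days[note_type].append(days)

    total = len(notes)
    result = {}
    for note_type, count in counts.items():
        days = resolved_days[note_type]
        result[note_type] = {
            "share_pct": 100.0 * count / total,            # note type distribution
            "resolution_success_rate": len(days) / count,  # resolved / all of this type
            "avg_days_to_resolution": sum(days) / len(days) if days else None,
        }
    return result


labelled = [("correction", 4), ("correction", None), ("new_place", 10), ("new_place", 20)]
print(metrics_by_type(labelled))
```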
Maintained By: OSM Notes Analytics Project
Contributions: Based on AngocA's comprehensive note classification system