IOST-ASCOL/nepali-datasets

Welcome to the Largest Nepali Datasets Collection

This is the most comprehensive repository of Nepali datasets available on GitHub. We aggregate and curate machine learning datasets for the Nepali language from multiple open sources.

Note: This repository aggregates publicly available datasets from open sources. If any content belongs to you, please submit a PR or contact us to request removal.

📋 Table of Contents

  1. Overview
  2. Text & NLP Datasets
  3. Audio & Speech Datasets
  4. Image Datasets
  5. Geospatial & Location Datasets
  6. Time Series & Real-Time Data
  7. Financial & Economic Datasets
  8. Specialized Datasets
  9. Embedding & Representation Learning
  10. Public Data Sources
  11. Related NLP Research & Tools
  12. Nepali Literature Dataset
  13. Nepali MNIST
  14. Contributing

Text & NLP Datasets

News & General Text Corpus

Wikipedia & Reference

Large Scale Text Corpora

Machine Translation & Parallel Corpora

Sentiment Analysis

Named Entity Recognition (NER)

Text Summarization

Literary & Cultural Text

Specialized Text Data


Audio & Speech Datasets

Text-to-Speech

Automatic Speech Recognition (ASR)

Character Speech

Speech Embeddings


Image Datasets

Handwritten Characters & Recognition

License Plate Recognition

General Images

Currency Recognition


Geospatial & Location Datasets

Maps & Geography


Time Series & Real-Time Data

Air Quality

Weather

Hydrology & Environment

Market Data


Financial & Economic Datasets

Stock Market

Currency Exchange


Specialized Datasets

Disaster & Emergency

Health


Embedding & Representation Learning

Word Embeddings


Public Data Sources


Related NLP Research & Tools


Nepali Literature Dataset


Nepali MNIST

Contributing

Found a Nepali dataset? Help us grow this collection!

How to Add:

  1. Star the repository
  2. Fork the repository
  3. Add the dataset to README.md under the appropriate category
  4. Format: - **Name** - [Link](url)
  5. Submit a PR

Requirements:

  • Publicly available dataset
  • Working direct link
  • Brief description (1-2 sentences)
  • Proper attribution
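
As a rough illustration of the entry format in step 4, the sketch below builds a README line and checks that a line matches the expected shape before you open a PR. The helper names `format_entry` and `is_valid_entry`, the regular expression, and the optional trailing description are assumptions made for this example; they are not part of the repository's tooling.

```python
import re

# Hypothetical pattern for the "- **Name** - [Link](url)" format.
# The optional " - description" suffix is an assumption, not a repository rule.
ENTRY_RE = re.compile(
    r"^- \*\*(?P<name>[^*]+)\*\* - "
    r"\[(?P<label>[^\]]+)\]\((?P<url>https?://[^\s)]+)\)"
    r"(?: - (?P<desc>.+))?$"
)


def format_entry(name, url, label="Link", description=""):
    """Build one README list item in the documented format."""
    line = f"- **{name}** - [{label}]({url})"
    if description:
        line += f" - {description}"
    return line


def is_valid_entry(line):
    """Return True if the line matches the assumed entry pattern."""
    return ENTRY_RE.match(line.strip()) is not None


if __name__ == "__main__":
    # Placeholder dataset name and URL, for illustration only.
    entry = format_entry(
        "Example Nepali Corpus",
        "https://example.com/nepali-corpus",
        description="Placeholder entry showing the format.",
    )
    print(entry)
    print(is_valid_entry(entry))
```

A check like this only confirms the shape of the line. Whether the link works and the dataset is publicly available still has to be verified by hand.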

We review PRs within 48 hours. Thanks! 🎉
