GSCI (Global Severity-weighted Conflict Index) is a daily, event-based measure of global conflict intensity constructed from the GDELT Project. It aggregates conflict-related events coded under the CAMEO taxonomy and weights them by Goldstein severity scores, providing a real-time, multilingual, and open-source complement to perception-based geopolitical risk indicators such as the GPR index of Caldara and Iacoviello (2022).
This repository provides:
- `gsci_data.csv`: the full daily GSCI series from March 2015 to present, including the raw `total_sources` denominator
- `index.html`: an interactive dashboard (live at jyingnan.github.io/GSCI-Tracker)
- `events.json`: annotated geopolitical event database for dashboard overlays
- `update_gsci.py`: automated weekly update script via BigQuery
- `.github/workflows/update.yml`: GitHub Actions workflow for scheduled updates
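GSCI is defined as:

$$
\mathrm{GSCI}_t = \sum_{i \in \mathcal{C}_t} \lvert G_i \rvert \cdot \frac{n_i}{N_t}
$$

| Symbol | Definition |
|---|---|
| $\mathcal{C}_t$ | Set of conflict events with strictly negative Goldstein scores on day $t$ |
| $\lvert G_i \rvert$ | Absolute Goldstein severity weight of event $i$ |
| $n_i$ | Number of news sources reporting event $i$ |
| $N_t$ | Total number of news sources on day $t$ (`total_sources`) |

Source-share normalization by $N_t$ weights each event by its share of the day's news sources rather than by its raw source count.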
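As a rough illustration of the definitions above, the sketch below sums the absolute Goldstein weight times the source count over negative-Goldstein events and divides by the day's total sources. This reading of the formula, the column names (`date`, `goldstein`, `num_sources`), and the toy numbers are assumptions made for illustration; they are not taken from `update_gsci.py`.

```python
import pandas as pd


def compute_gsci(events: pd.DataFrame, total_sources: pd.Series) -> pd.Series:
    """Severity-weighted, source-share-normalized conflict intensity per day.

    events: one row per event, with hypothetical columns
        date, goldstein, num_sources.
    total_sources: total number of news sources per day, indexed by date.
    """
    # Keep only events with strictly negative Goldstein scores.
    conflict = events[events["goldstein"] < 0]
    # Weight each event's source count by its absolute severity.
    weighted = conflict["goldstein"].abs() * conflict["num_sources"]
    numerator = weighted.groupby(conflict["date"]).sum()
    # Days with no conflict events get a numerator of zero.
    numerator = numerator.reindex(total_sources.index, fill_value=0.0)
    return (numerator / total_sources).rename("gsci")


# Toy data, invented for this example.
events = pd.DataFrame(
    {
        "date": ["2024-01-01", "2024-01-01", "2024-01-01", "2024-01-02"],
        "goldstein": [-10.0, -2.0, 3.0, -5.0],
        "num_sources": [4, 5, 10, 2],
    }
)
total_sources = pd.Series({"2024-01-01": 100, "2024-01-02": 50})

gsci = compute_gsci(events, total_sources)
print(gsci)
```

On the toy data, the first day gives (10 × 4 + 2 × 5) / 100 = 0.5 and the second gives (5 × 2) / 50 = 0.2; the event with a positive Goldstein score is ignored.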
File: gsci_data.csv
| Column | Type | Description |
|---|---|---|
| `date` | string (YYYY-MM-DD) | Calendar date |
| `total_sources` | integer | Total number of news sources on that day (the GSCI denominator) |
| `gsci` | float | Daily GSCI value (severity-weighted conflict intensity) |
- Coverage: March 1, 2015 – present
- Frequency: Daily
- Update schedule: Every Monday (automated via GitHub Actions + Google BigQuery)
- Source: GDELT v2 Event Stream
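A minimal sketch of loading the series with pandas, assuming the three documented columns. The rows below are made-up placeholders so the example runs on its own; pass the path to `gsci_data.csv` instead to read the real file. The helper name `load_gsci` is hypothetical.

```python
import io

import pandas as pd

# Made-up rows in the documented schema (not real GSCI values).
sample_csv = io.StringIO(
    "date,total_sources,gsci\n"
    "2015-03-02,48000,1.31\n"
    "2015-03-01,52000,1.25\n"
    "2015-03-03,9500,2.10\n"
)


def load_gsci(source) -> pd.DataFrame:
    """Read the GSCI CSV and return it indexed by date, oldest first."""
    df = pd.read_csv(source, parse_dates=["date"])
    return df.set_index("date").sort_index()


df = load_gsci(sample_csv)  # or load_gsci("gsci_data.csv")
print(df)
```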
The `total_sources` column records the GSCI denominator. It can drop or shift for reasons unrelated to conflict activity:
- Holiday Effect — global news production declines on public holidays, reducing GDELT ingestion volume.
- GDELT Crawler Outages — GDELT's infrastructure occasionally experiences 12–24 hour data gaps due to Google Cloud sync delays or external crawl rate-limit events.
- Major News Crowding — a dominant breaking story can compress coverage of other events, skewing source distribution.
- Deduplication & Versioning Changes — GDELT's internal deduplication logic or versioning updates can cause step-changes in reported source counts.
Recommended practice: Days where total_sources < 10,000 should be interpreted with caution. The interactive dashboard flags these dates with amber highlights and a tooltip warning. For robust analysis, consider excluding such observations or applying a threshold appropriate to your use case.
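The recommended practice can be sketched as follows: flag days below the 10,000-source threshold, then either drop them or handle them separately. The data values are invented, and the names `LOW_SOURCE_THRESHOLD` and `flag_low_source` are hypothetical.

```python
import pandas as pd

LOW_SOURCE_THRESHOLD = 10_000  # threshold suggested in the text above


def flag_low_source(df: pd.DataFrame, threshold: int = LOW_SOURCE_THRESHOLD) -> pd.DataFrame:
    """Return a copy of df with a boolean low_source column."""
    out = df.copy()
    out["low_source"] = out["total_sources"] < threshold
    return out


# Invented rows in the documented schema.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-12-24", "2024-12-25", "2024-12-26"]),
        "total_sources": [52000, 8500, 61000],
        "gsci": [1.2, 3.4, 1.1],
    }
)

flagged = flag_low_source(df)
clean = flagged[~flagged["low_source"]]  # exclude low-source observations
print(clean)
```

A different threshold can be passed as the second argument when 10,000 is not appropriate for a given use case.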
An interactive dashboard is available at:
🔗 https://jyingnan.github.io/GSCI-Tracker/
Features:
- Time range selector (1Y / 3Y / 5Y / All)
- Toggle for geopolitical event annotations (Armed Conflict / Terrorism / Political Crisis / Domestic)
- Low-source highlight toggle: flags days with `total_sources < 10,000` in amber
- Hover tooltip showing GSCI value, sample mean, and raw source count (with warning if below threshold)
- 1-year peak stat (calculated over clean-data days only, excluding low-source observations)
- Light / dark theme
- Downloadable CSV
To recompute GSCI from scratch using Google BigQuery:
1. Install dependencies:

       pip install pandas google-cloud-bigquery db-dtypes

2. Set up GCP credentials. Create a service account with BigQuery read access and export the key as an environment variable:

       export GCP_SERVICE_ACCOUNT_KEY='<your_json_key>'

3. Run the update script:

       python update_gsci.py

The script queries the `gdelt-bq.gdeltv2.events` table, applies the GSCI formula, and overwrites `gsci_data.csv` with the full series, including `total_sources`.
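For orientation, the kind of query such a script might send could look like the sketch below. It is not the contents of `update_gsci.py`. It assumes the denominator is the sum of `NumSources` over all events on a day, that "conflict" means `GoldsteinScale < 0` with no additional CAMEO filter, and that the series starts at `SQLDATE` 20150301. The function name `build_gsci_query` is hypothetical.

```python
GDELT_EVENTS_TABLE = "gdelt-bq.gdeltv2.events"


def build_gsci_query(start_sqldate: int = 20150301) -> str:
    """Build a BigQuery SQL string for a daily severity-weighted index.

    start_sqldate is an integer in YYYYMMDD form, matching GDELT's SQLDATE.
    """
    if not (10000101 <= start_sqldate <= 99991231):
        raise ValueError("start_sqldate must be an integer in YYYYMMDD form")
    return f"""
    SELECT
      PARSE_DATE('%Y%m%d', CAST(SQLDATE AS STRING)) AS `date`,
      -- Assumed denominator: all sources across all events that day.
      SUM(NumSources) AS total_sources,
      -- Assumed numerator: |Goldstein| x sources over negative-score events.
      SAFE_DIVIDE(
        SUM(IF(GoldsteinScale < 0, ABS(GoldsteinScale) * NumSources, 0)),
        SUM(NumSources)
      ) AS gsci
    FROM `{GDELT_EVENTS_TABLE}`
    WHERE SQLDATE >= {start_sqldate}
    GROUP BY 1
    ORDER BY 1
    """


query = build_gsci_query()
print(query)

# With credentials configured, the query could be run with something like
# google.cloud.bigquery.Client().query(query).to_dataframe() (not executed here).
```

Building the SQL as a plain string keeps the example runnable without GCP access; a full scan of the events table is billed by BigQuery, so it is worth testing on a narrow date range first.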
The repository uses GitHub Actions for weekly updates:
- Trigger: Every Monday at 00:00 UTC (or manual dispatch via `workflow_dispatch`)
- Process: Authenticates to GCP → queries BigQuery → overwrites `gsci_data.csv` → commits and pushes
- Required secret: Add `GCP_SERVICE_ACCOUNT_KEY` (full JSON) in your repository's Settings → Secrets and variables → Actions
GSCI is designed as an event-based complement to the Geopolitical Risk (GPR) index of Caldara and Iacoviello (2022), not a replacement. The two indices measure different dimensions:
| | GSCI | GPR |
|---|---|---|
| Basis | Recorded conflict events (GDELT) | Editorial newspaper coverage |
| Languages | 100+ languages | English only (10 newspapers) |
| Update frequency | Weekly (near real-time) | Weekly |
| Historical coverage | 2015–present | 1900–present |
| Nature | Event severity | Perceived risk |
If you use GSCI data in your research, please cite:
[Author(s)]. "From Ten Newspapers to the World: An Event-Based Global Conflict Intensity Index." Working Paper, 2026. GitHub: https://github.com/jyingnan/GSCI-Tracker
The GSCI data series is also directly citable as a dataset:
[Author(s)]. Global Severity-weighted Conflict Index (GSCI). Daily data, March 2015–present. Available at: https://github.com/jyingnan/GSCI-Tracker
- Caldara, D., Iacoviello, M. (2022). Measuring geopolitical risk. American Economic Review, 112(4), 1194–1225.
- GDELT Project: https://www.gdeltproject.org/
This dataset and associated code are released under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to use, share, and adapt the material for any purpose, including commercial use, provided appropriate credit is given.
The underlying GDELT data is made available by the GDELT Project under its own open access terms.