Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased]
[0.4.0] - 2026-04-20
Added
- marketgoblin.sector_indices: public module exposing refreshable US GICS sector → index/ETF mappings
- SectorIndexMapping, SectorIndex, IndustryGroup, Industry, SubIndustry dataclasses: full 4-level GICS tree (sector → industry group → industry → sub-industry) with per-level GICS codes and constituent_count
- load_sector_indices(market="US"): read the shipped JSON snapshot (src/marketgoblin/_sector_indices_data/us.json)
- refresh_sector_indices(market="US", output_path=None): re-run the parser against the S&P 500 Wikipedia constituents page and rewrite the snapshot
- Curated GICS 2023 taxonomy shipped as gics_taxonomy_us.json (11 sectors, 25 industry groups, 73 industries, 163 sub-industries): parser joins scraped constituents against it and rolls counts up the hierarchy; unknown upstream sub-industries fail loud
- TODO.md roadmap at the repo root tracking coverage phases (US index families, international markets) and parser hardening
- hypothesis>=6.100 added to the dev extra: powers property tests for rollup invariants (sum-of-children == parent at every level) and JSON roundtrip
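The rollup invariant mentioned above (sum-of-children == parent at every level) can be illustrated with a minimal sketch. The `Node` class, `roll_up`, and `check_invariant` are hypothetical stand-ins, not the package's actual dataclasses or parser, and the constituent counts are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one level of the GICS tree."""

    code: str
    name: str
    children: list["Node"] = field(default_factory=list)
    constituent_count: int = 0


def roll_up(node: Node) -> int:
    """Set every parent's count to the sum of its children's counts, bottom-up."""
    if node.children:
        node.constituent_count = sum(roll_up(child) for child in node.children)
    return node.constituent_count


def check_invariant(node: Node) -> bool:
    """Return True when sum-of-children == parent holds at every level."""
    if not node.children:
        return True
    total = sum(child.constituent_count for child in node.children)
    return total == node.constituent_count and all(
        check_invariant(child) for child in node.children
    )


# Illustrative counts only; leaves carry the scraped constituent counts.
software = Node(
    "451030",
    "Software",
    children=[
        Node("45103010", "Application Software", constituent_count=9),
        Node("45103020", "Systems Software", constituent_count=6),
    ],
)
group = Node("4510", "Software & Services", children=[software])
sector = Node("45", "Information Technology", children=[group])
roll_up(sector)
```

A property test would generate arbitrary trees, call `roll_up`, and assert `check_invariant` on the result.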
[0.3.0] - 2026-04-20
Added
- TickerMetadata dataclass: unified, source-agnostic ticker profile collapsing yfinance's info / fast_info / history_metadata / isin into one shape
- Classification, SectorProfile, IndustryProfile dataclasses: sector + industry classification for a ticker via yf.Sector / yf.Industry
- MarketGoblin.fetch_metadata(symbol, *, fast=False) / load_metadata(symbol): live-fetch or disk-load ticker metadata
- MarketGoblin.fetch_classification(symbol) / load_classification(symbol): live-fetch or disk-load sector + industry classification
- YahooSource.fetch_metadata() and fetch_classification(): yfinance-backed implementations with retry/backoff; classification parallelizes sector + industry lookups
- DiskStorage.save_metadata / load_metadata / save_classification / load_classification: JSON persistence at {provider}/metadata/{SYMBOL}.json and {provider}/classification/{SYMBOL}.json
- JSONSerializable mixin (_serialization.py): shared to_dict / from_dict for JSON-backed dataclasses; tolerates unknown keys on load
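One way a mixin could tolerate unknown keys on load is to filter the incoming dict against the dataclass's declared fields. The sketch below assumes that approach; the real `_serialization.py` may differ, and the `Profile` dataclass is a hypothetical example, not one of the shipped types.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Any, Optional


class JSONSerializable:
    """Sketch of a to_dict / from_dict mixin for dataclasses (assumed design)."""

    def to_dict(self) -> dict[str, Any]:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict[str, Any]):
        # Keep only keys that match declared fields, so files written by a
        # newer version with extra keys still load.
        known = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in data.items() if key in known})


@dataclass
class Profile(JSONSerializable):
    """Hypothetical JSON-backed dataclass used only for this example."""

    symbol: str
    sector: Optional[str] = None


payload = json.loads('{"symbol": "AAPL", "sector": "Technology", "added_later": 1}')
profile = Profile.from_dict(payload)
```

Dropping unknown keys silently trades strictness for forward compatibility: an older reader can open a newer file, at the cost of not noticing misspelled keys.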
Changed
- YahooSource split into orchestration (yahoo.py) and pure adapter/parser helpers (_yahoo_parsing.py)
- _metadata.write() generalized to accept any target path and create parent dirs (was sidecar-only)
[0.2.0] - 2026-04-20
Added
- Dataset enum (OHLCV, SHARES, DIVIDENDS) exported from the package root for dataset selection
- Shares-outstanding dataset via Yahoo (yfinance.Ticker.get_shares_full): sparse, corporate-action-driven series deduplicated to one row per day
- Dividends dataset via Yahoo (yfinance.Ticker.dividends): event-driven series filtered to the requested date range
- is_adjusted: bool column on OHLCV frames, so adjusted and raw variants now live in a single tidy stacked series
- MarketGoblin.supported_datasets property exposing the datasets a provider supports
- dataset= parameter on fetch(), load(), and fetch_many() (defaults to Dataset.OHLCV; existing callers unchanged)
- CSVSource(is_adjusted=...) init kwarg stamps the variant flag on every row (CSVs hold a single variant by assumption)
- normalize_shares(), normalize_dividends() in _normalize.py and build_shares(), build_dividends() in _metadata.py
- Uniform dataset-aware path scheme in DiskStorage: {provider}/{dataset}/{SYMBOL}/{SYMBOL}_{YYYY-MM}.pq, with no adjusted|raw segment for any dataset
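The path scheme above can be expressed as a small helper. This is a sketch only: `slice_path` is a hypothetical name, and the lowercase dataset segment and upper-cased symbol are assumptions rather than confirmed DiskStorage behavior.

```python
from datetime import date
from pathlib import PurePosixPath


def slice_path(provider: str, dataset: str, symbol: str, month: date) -> PurePosixPath:
    """Build {provider}/{dataset}/{SYMBOL}/{SYMBOL}_{YYYY-MM}.pq for one monthly slice."""
    symbol = symbol.upper()  # assumption: symbols are stored upper-case
    return PurePosixPath(provider, dataset, symbol, f"{symbol}_{month:%Y-%m}.pq")


example = slice_path("yahoo", "ohlcv", "aapl", date(2026, 4, 20))
```

Because no dataset gets an adjusted|raw segment, the same function serves OHLCV, shares, and dividends.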
Changed
- Per-source dataset dispatch: sources declare supported datasets via _build_dispatch(); BaseSource.fetch() takes a Dataset as its first argument
- OHLCV is fetched in a single yf.Ticker.history(auto_adjust=False) call; adjusted Open/High/Low are derived locally via the Adj Close / Close ratio (zero numerical drift vs yfinance's auto_adjust=True, half the network calls)
- OHLCV metadata sidecar: price_adjusted replaced by has_adjusted / has_raw; missing-days analysis now runs on unique dates; new unique_days field
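The Adj Close / Close derivation described above can be sketched for a single bar. `adjust_ohlc` is a hypothetical helper working on plain floats; the package presumably applies the same ratio column-wise to a data frame, which is an assumption here.

```python
def adjust_ohlc(
    open_: float, high: float, low: float, close: float, adj_close: float
) -> tuple[float, float, float, float]:
    """Scale raw Open/High/Low by Adj Close / Close; adjusted Close is Adj Close."""
    if close == 0:
        raise ValueError("close must be non-zero to compute the adjustment ratio")
    factor = adj_close / close
    return open_ * factor, high * factor, low * factor, adj_close


# Illustrative bar: a raw close of 100 with an adjusted close of 98 gives a
# ratio of 0.98, applied to every raw price on that day.
adj_open, adj_high, adj_low, adj_close_out = adjust_ohlc(101.0, 103.0, 99.0, 100.0, 98.0)
```

Since the ratio comes from the same response as the raw prices, one `history` call yields both variants.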
Removed
- adjusted parameter from MarketGoblin.fetch() / load() / fetch_many(), BaseSource.fetch(), per-dataset Fetcher signature, and DiskStorage.save() / load(); OHLCV variants are distinguished by the is_adjusted column instead (breaking)
[0.1.2] - 2026-04-17
Removed
- Undocumented report=True option on MarketGoblin and the download_report.csv sidecar; not part of the public API surface defined in .claude/rules/project.md
Changed
- File header comments on every module per code-style.md rule 10
- Flattened for loops in tests to comply with testing.md rule 32 (no logic in tests)
- Fixed volume dtype in test fixtures (Float32 → Int64) and file_size_bytes arg type in test_metadata.py
[0.1.1] - 2026-04-16
Added
- Retry logic with exponential backoff in YahooSource.fetch() (3 attempts, 1 s / 2 s delays)
- Rate limiting in fetch_many() via a token-bucket _RateLimiter (default: 2 req/s)
- Input validation for date format and ordering in fetch(), load(), and fetch_many()
- CSVSource: a file-backed OHLCV source for local CSV data
- **source_kwargs forwarding in MarketGoblin.__init__() for provider-specific options
- Documentation site at aexsalomao.github.io/marketgoblin
- Automated PyPI publish workflow via GitHub Actions Trusted Publishing (OIDC)
- Ruff linting + formatting, mypy strict type checking, pre-commit hooks
- GitHub Actions CI workflow (lint → format → typecheck → test → Codecov)
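The retry entry above (3 attempts, 1 s / 2 s delays) corresponds to a doubling backoff schedule. A minimal sketch follows; `with_retry` is a hypothetical helper, not the code inside YahooSource.fetch(), and catching every `Exception` is a simplification.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); after each failure wait base_delay * 2**n, then retry."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2**attempt)  # 1 s, then 2 s with the defaults
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter lets tests record the delays instead of actually waiting.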
Changed
- Volume column dtype changed from float32 to int64 for accuracy
[0.1.0] - 2026-04-16
Added
- Initial release
- MarketGoblin public API facade (fetch, load, fetch_many)
- YahooSource backed by yfinance
- DiskStorage: monthly Parquet slices with atomic writes and JSON sidecars
- normalize() and parse_dates() in _normalize.py
- build() and write() metadata helpers in _metadata.py
- 33 unit tests across all modules
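Atomic writes of the kind mentioned above are commonly implemented by writing to a temporary file in the target directory and renaming it over the destination. The sketch below assumes that pattern; `atomic_write_bytes` is a hypothetical helper, and the changelog does not say how DiskStorage actually does it.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(path: Path, payload: bytes) -> None:
    """Write payload so readers see the old file or the new one, never a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so the final rename stays on
    # one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        os.replace(tmp_name, path)  # replaces any existing file at path
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)  # clean up the partial temp file
        raise
```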