Contributing
Thanks for your interest in contributing to marketgoblin!
Setup
git clone https://github.com/aexsalomao/marketgoblin
cd marketgoblin
uv sync --extra dev
pre-commit install
Workflow
- Fork the repo and create a branch: {name}_fix_{description} or {name}_dev_{description}
- Make your changes
- Run the full check suite locally before pushing
- Open a pull request against master
CI runs the same checks automatically on every PR.
Adding a Data Source
Subclass BaseSource, declare the datasets it supports via _build_dispatch(), and register the new class in goblin.py:
# src/marketgoblin/sources/mysource.py
import polars as pl

from marketgoblin import Dataset
from marketgoblin.sources.base import BaseSource, Fetcher

class MySource(BaseSource):
    name = "mysource"

    def _build_dispatch(self) -> dict[Dataset, Fetcher]:
        return {Dataset.OHLCV: self._fetch_ohlcv}

    def _fetch_ohlcv(self, symbol: str, start: str, end: str) -> pl.LazyFrame:
        ...  # return a normalized LazyFrame with an is_adjusted column

# src/marketgoblin/goblin.py
_SOURCES = {"yahoo": YahooSource, "csv": CSVSource, "mysource": MySource}
Per-dataset fetchers all share the (symbol, start, end) signature. OHLCV fetchers return a tidy stacked frame containing both adjusted and raw rows distinguished by is_adjusted.
Adding a Dataset
Adding a new Dataset member is a four-step change:
- Extend Dataset in src/marketgoblin/datasets.py
- Add _fetch_<dataset> to relevant sources and register it in their _build_dispatch()
- Add normalize_<dataset> in _normalize.py and build_<dataset> in _metadata.py
- Extend DiskStorage._build_metadata dispatch in storage/disk.py
Add tests in tests/test_<thing>.py covering the happy path and error cases.
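A test module might look roughly like the sketch below. Everything in it is a stand-in so the example runs on its own: the Dataset enum, the FakeSource class, its fetch() method, the DIVIDENDS member, and the ValueError raised for an unsupported dataset are all assumptions, not the project's actual API. Real tests would import from marketgoblin instead.

```python
from collections.abc import Callable
from enum import Enum

Rows = list[dict[str, object]]


class Dataset(Enum):  # stand-in for marketgoblin.Dataset
    OHLCV = "ohlcv"
    DIVIDENDS = "dividends"  # hypothetical member, used only for the error case


class FakeSource:  # stand-in for a BaseSource subclass
    name = "fake"

    def _build_dispatch(self) -> dict[Dataset, Callable[[str, str, str], Rows]]:
        return {Dataset.OHLCV: self._fetch_ohlcv}

    def _fetch_ohlcv(self, symbol: str, start: str, end: str) -> Rows:
        return [{"symbol": symbol, "date": start, "is_adjusted": True}]

    def fetch(self, dataset: Dataset, symbol: str, start: str, end: str) -> Rows:
        # Assumed behaviour: unsupported datasets raise; the real error type may differ.
        fetcher = self._build_dispatch().get(dataset)
        if fetcher is None:
            raise ValueError(f"{self.name} does not support {dataset.name}")
        return fetcher(symbol, start, end)


def test_fetch_ohlcv_happy_path() -> None:
    rows = FakeSource().fetch(Dataset.OHLCV, "XYZ", "2024-01-02", "2024-01-03")
    assert rows[0]["symbol"] == "XYZ"
    assert rows[0]["is_adjusted"] is True


def test_fetch_unsupported_dataset_raises() -> None:
    try:
        FakeSource().fetch(Dataset.DIVIDENDS, "XYZ", "2024-01-02", "2024-01-03")
    except ValueError as exc:
        assert "DIVIDENDS" in str(exc)
    else:
        raise AssertionError("expected ValueError for an unsupported dataset")
```

Both functions use plain assert statements, so a test runner that collects test_* functions can pick them up without extra fixtures.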
Code Style
- Ruff for linting and formatting (PEP 8, enforced via pre-commit)
- mypy for static type checking — all public functions must be fully annotated
- Polars over pandas everywhere
- Pure functions preferred; inject dependencies for testability
- No speculative abstractions — solve the problem at hand
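To make the "pure functions, injected dependencies" guideline concrete, here is a minimal sketch. The function name default_end_date and its parameters are hypothetical and do not exist in the codebase; the point is only that the clock is passed in rather than read from global state.

```python
from __future__ import annotations

from collections.abc import Callable
from datetime import date


def default_end_date(end: str | None, today: Callable[[], date] = date.today) -> str:
    """Return `end` unchanged if given, otherwise today's date in ISO format.

    The clock is injected through `today`, so tests can pin it to a fixed date.
    """
    return end if end is not None else today().isoformat()


# In a test, replace the clock with a deterministic stand-in.
fixed = default_end_date(None, today=lambda: date(2024, 1, 5))
```

Production callers can omit the today argument and get the real clock, while tests stay deterministic without patching the datetime module.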
Future Contributions
Areas where help is especially welcome:
- Additional data sources (Polygon.io, Alpha Vantage, IBKR)
- Async fetch()/fetch_many() support
- CLI entrypoint
- More sophisticated missing-day logic (holiday calendars per exchange)
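For anyone picking up the missing-day item, one possible starting point is sketched below. It is not how the project currently detects missing days; the function names, the Monday-to-Friday trading week, and the hand-written holiday set are all assumptions made for the example.

```python
from datetime import date, timedelta


def expected_trading_days(start: date, end: date, holidays: frozenset[date]) -> list[date]:
    """List weekdays in [start, end] that are not in the exchange's holiday set."""
    days: list[date] = []
    current = start
    while current <= end:
        # Assumption: the exchange trades Monday to Friday (weekday() 0 through 4).
        if current.weekday() < 5 and current not in holidays:
            days.append(current)
        current += timedelta(days=1)
    return days


def missing_days(
    observed: set[date], start: date, end: date, holidays: frozenset[date]
) -> list[date]:
    """Return expected trading days that have no observed row."""
    return [d for d in expected_trading_days(start, end, holidays) if d not in observed]


# Week of Monday 2024-01-01, treating New Year's Day as the only holiday.
holidays = frozenset({date(2024, 1, 1)})
observed = {date(2024, 1, 2), date(2024, 1, 3), date(2024, 1, 5)}
gaps = missing_days(observed, date(2024, 1, 1), date(2024, 1, 5), holidays)
```

A per-exchange version could swap the single holiday set for a lookup keyed by exchange, which keeps both functions pure and easy to test.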