Cached Historical Data Fetcher
A Python utility for fetching historical data with caching. Suitable for acquiring data that is added frequently and incrementally, such as news, posts, and weather.
Installation
Install this via pip (or your favourite package manager):
```shell
pip install cached-historical-data-fetcher
```
Features
- Ready to use with `asyncio`, `aiohttp`, and `aiohttp-client-cache`.
- Uses `asyncio.gather` to fetch chunks in parallel. (For performance reasons, relying on `aiohttp-client-cache` alone is probably not a good idea when fetching a large number of chunks, i.e. web requests.)
- Based on `pandas` and supports `MultiIndex`.
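The parallel-chunk idea behind the library can be sketched with the standard library alone. This is an illustrative stand-in, not the package's API: `get_one` and `update` here are hypothetical names, and a real fetch would be an `aiohttp` request rather than a no-op sleep.

```python
import asyncio
from datetime import datetime, timedelta, timezone

async def get_one(start: datetime) -> dict:
    # Stand-in for fetching one chunk (e.g. one web request).
    await asyncio.sleep(0)  # yield control, as a real aiohttp call would
    return {"start": start, "day": start.day}

async def update(start: datetime, end: datetime, interval: timedelta) -> list:
    # Split [start, end) into fixed-size chunks and fetch them concurrently.
    starts = []
    t = start
    while t < end:
        starts.append(t)
        t += interval
    return await asyncio.gather(*(get_one(s) for s in starts))

chunks = asyncio.run(update(
    datetime(2023, 9, 30, tzinfo=timezone.utc),
    datetime(2023, 10, 3, tzinfo=timezone.utc),
    timedelta(days=1),
))
print([c["day"] for c in chunks])  # → [30, 1, 2]
```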
Usage
Override the `get_one` method to fetch data for one chunk. The `update` method calls `get_one` for each chunk and concatenates the results.
```python
from typing import Any

from cached_historical_data_fetcher import HistoricalDataCacheWithFixedChunk
from pandas import DataFrame, Timedelta, Timestamp

class MyCacheWithFixedChunk(HistoricalDataCacheWithFixedChunk):
    delay_seconds: float = 0  # delay between chunks
    interval: Timedelta = Timedelta(days=1)  # chunk interval
    start_init: Timestamp = Timestamp.utcnow().floor("10D")  # start date

    async def get_one(self, start: Timestamp, *args: Any, **kwargs: Any) -> DataFrame:
        return DataFrame({"day": [start.day]}, index=[start])

df = await MyCacheWithFixedChunk().update()
```
```
                           day
2023-09-30 00:00:00+00:00   30
2023-10-01 00:00:00+00:00    1
2023-10-02 00:00:00+00:00    2
```
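The caching behaviour — subsequent updates only fetch chunks not already stored — can be sketched with a simplified in-memory stand-in. The class and method names below are illustrative, not the package's implementation:

```python
from datetime import datetime, timedelta, timezone

class IncrementalCache:
    """Simplified sketch: stores fetched chunks, only fetches missing ones."""

    def __init__(self, interval: timedelta):
        self.interval = interval
        self.chunks: dict = {}  # chunk start -> fetched value

    def fetch(self, start: datetime) -> int:
        return start.day  # stand-in for an expensive fetch

    def update(self, start: datetime, end: datetime) -> int:
        # Walk the chunk grid; skip chunks already in the cache.
        fetched = 0
        t = start
        while t < end:
            if t not in self.chunks:
                self.chunks[t] = self.fetch(t)
                fetched += 1
            t += self.interval
        return fetched

cache = IncrementalCache(timedelta(days=1))
start = datetime(2023, 9, 30, tzinfo=timezone.utc)
first = cache.update(start, start + timedelta(days=3))   # fetches 3 chunks
second = cache.update(start, start + timedelta(days=4))  # only 1 new chunk
print(first, second)  # → 3 1
```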
See `example.ipynb` for a real-world example.
Contributors ✨
Thanks goes to these wonderful people (emoji key):
This project follows the all-contributors specification. Contributions of any kind welcome!