GIS and Weather Data Processing - Initial Questions
Core Questions
- What specific GIS formats and operations are most critical for your current projects?
Well, I am not entirely sure. I am guessing that I'll need to pull in shapefiles? I will be working with an entirely self-hosted GIS stack (not Google Maps or anything). I know tools like GDAL and tippecanoe exist.
I think formats like Parquet as well. Maybe DuckDB?
Reference these posts for the type of workflows that I would like to run:
https://tech.marksblogg.com/american-solar-farms.html
https://tech.marksblogg.com/canadas-odb-buildings.html
https://tech.marksblogg.com/ornl-fema-buildings.html
Extract patterns/architecture/approaches along with the specific reductions to practice.
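A minimal sketch of one workflow the tools named above make possible: reading a shapefile with DuckDB's spatial extension and writing it out as ZSTD-compressed Parquet. The function names and file paths are hypothetical, and the `duckdb` Python package is assumed to be installed in the container; it is imported lazily so the SQL builder can be checked without it. This is not taken from the referenced posts.

```python
def shapefile_to_parquet_sql(shp_path: str, parquet_path: str) -> str:
    """Build a DuckDB COPY statement that converts a shapefile to Parquet."""
    # Escape single quotes so paths are safe inside SQL string literals.
    src = str(shp_path).replace("'", "''")
    dst = str(parquet_path).replace("'", "''")
    return (
        f"COPY (SELECT * FROM ST_Read('{src}')) "
        f"TO '{dst}' (FORMAT PARQUET, COMPRESSION ZSTD)"
    )


def convert_shapefile(shp_path: str, parquet_path: str) -> None:
    """Run the conversion in an in-memory DuckDB session."""
    import duckdb  # assumption: installed in the GIS container

    con = duckdb.connect()
    con.execute("INSTALL spatial")
    con.execute("LOAD spatial")
    con.execute(shapefile_to_parquet_sql(shp_path, parquet_path))
    con.close()
```

Calling `convert_shapefile("/data/in/parcels.shp", "/data/out/parcels.parquet")` would read from and write to directories mounted into the container; both paths are placeholders.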
- What weather data sources and APIs do you currently use or plan to use?
None currently. But I'll be hacking/forking a system to predict balloon paths. I suspect I'll need to process GRIB data. Also probably use the NOAA and European equivalent APIs? Maybe some bulk HTTP/FTP download?
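A rough sketch of what bulk HTTP download of GRIB data could look like, using only the standard library. It assumes NOAA's public GFS bucket on AWS Open Data and its current key layout (`gfs.YYYYMMDD/HH/atmos/...`); the base URL, the 0.25-degree product choice, and the function names are assumptions for illustration, not a decision about which source the balloon-prediction system should use.

```python
from datetime import date
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: NOAA's public GFS bucket on AWS Open Data.
GFS_BASE = "https://noaa-gfs-bdp-pds.s3.amazonaws.com"


def gfs_grib_url(run_date: date, cycle_hour: int, forecast_hour: int) -> str:
    """Build the URL of one 0.25-degree GFS GRIB2 file."""
    if cycle_hour not in (0, 6, 12, 18):
        raise ValueError("GFS cycles run at 00, 06, 12 and 18 UTC")
    if not 0 <= forecast_hour <= 384:
        raise ValueError("GFS forecasts extend from hour 0 to hour 384")
    return (
        f"{GFS_BASE}/gfs.{run_date:%Y%m%d}/{cycle_hour:02d}/atmos/"
        f"gfs.t{cycle_hour:02d}z.pgrb2.0p25.f{forecast_hour:03d}"
    )


def download(url: str, dest_dir: str) -> Path:
    """Fetch one file into dest_dir, keeping the remote file name."""
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    dest.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, dest)
    return dest
```

A prototype on the workstation might fetch only a few forecast hours, for example `[gfs_grib_url(date(2024, 1, 1), 0, h) for h in (0, 3, 6)]`, and leave the full range for the large servers.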
- Are there any specific performance requirements for processing large datasets?
I suspect I'll do some early prototyping with small data sets on my workstation and then run the container with the real data sets on my big RAM/CPU/disk servers.
- Do you need integration with specific databases (PostGIS, etc.)?
Yes I will be heavily using PostGIS for sure.
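One common way to get a shapefile into PostGIS is GDAL's `ogr2ogr` command-line tool. The sketch below only builds the argument list so it can be inspected or logged before running; the connection defaults, the `gis` database and user names, and the table name are hypothetical, and the password is assumed to come from the `PGPASSWORD` environment variable rather than the command line.

```python
import subprocess


def ogr2ogr_to_postgis(shp_path, table, *, host="localhost", port=5432,
                       dbname="gis", user="gis", srid=4326):
    """Return an ogr2ogr argv list that loads a shapefile into a PostGIS table."""
    pg_dsn = f"PG:host={host} port={port} dbname={dbname} user={user}"
    return [
        "ogr2ogr",
        "-f", "PostgreSQL", pg_dsn,     # output driver and destination
        str(shp_path),                  # source dataset
        "-nln", table,                  # name of the new layer (table)
        "-nlt", "PROMOTE_TO_MULTI",     # avoid mixed single/multi geometry errors
        "-t_srs", f"EPSG:{srid}",       # reproject on load
        "-lco", "GEOMETRY_NAME=geom",   # geometry column name
    ]


def load_shapefile(shp_path, table, **conn):
    """Run ogr2ogr; raises CalledProcessError if the load fails."""
    subprocess.run(ogr2ogr_to_postgis(shp_path, table, **conn), check=True)
```

Keeping the command builder separate from the runner makes it easy to print the exact command during prototyping and reuse the same call unchanged on the production servers.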
Technical Questions
- Should we include both Python and R stacks in the same containers or separate them?
I am not sure? Whatever you think is best?
- What level of visualization capability is needed (command-line, web-based, desktop)?
All of those I think. I want flexibility.
- Are there any licensing constraints or requirements to consider?
I will be working only with public data sets.
- Do you need GPU support for any processing tasks?
Yes, but make it optional. I don't want to be blocked by GPU complexity right now.
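One possible way to keep GPU support optional is an explicit opt-in flag plus a runtime check, so the default path never touches GPU tooling. The `USE_GPU` environment variable is a hypothetical name, and checking for `nvidia-smi` on the `PATH` is only a coarse heuristic that assumes NVIDIA hardware.

```python
import os
import shutil


def gpu_enabled(env=None) -> bool:
    """Return True only if the GPU was requested and appears to be available."""
    env = os.environ if env is None else env
    # Opt-in: anything other than USE_GPU=1 means CPU-only.
    if env.get("USE_GPU", "0") != "1":
        return False
    # Coarse availability check: the NVIDIA CLI must be on the PATH.
    return shutil.which("nvidia-smi") is not None
```

Processing code can then branch on `gpu_enabled()` and fall back to the CPU path whenever it returns `False`.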
Integration Questions
- How should GIS/Weather outputs integrate with documentation workflows?
I will be using GIS/Weather in CTO mode only. I will also be using documentation in CTO mode alongside it.
I think, for now, they can be siblings but not have strong integration.
ANSWER: GIS/Weather and documentation containers will operate as siblings in CTO mode, with loose integration for now.
- Do you need persistent data storage within containers?
I do not think so. I will use docker compose to pass in directory paths.
Oh, I will want to push finished data to MinIO buckets.
I don't know how to best architect my ETL toolbox. I will mostly be doing ETL on GIS/Weather data, but I can see also needing to do other business-type ETL workflows in COO mode.
ANSWER: Use Docker compose volume mounts for data input/output. Primary output destination will be MinIO buckets for business use. ETL toolbox should handle both GIS/Weather (CTO) and business (COO) workflows.
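The output step described above could be sketched as follows: walk the mounted output directory, map each file to an object key, and upload it with the `minio` Python client. The environment variable names, the bucket name, and the `/data/out` mount path are hypothetical, and the `minio` package is assumed to be installed; it is imported lazily so the key-planning step can be tested on its own.

```python
import os
from pathlib import Path


def plan_uploads(output_dir, prefix=""):
    """Map every file under output_dir to a (local_path, object_key) pair."""
    root = Path(output_dir)
    pairs = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            if prefix:
                key = f"{prefix.strip('/')}/{key}"
            pairs.append((path, key))
    return pairs


def push_to_minio(output_dir, bucket, prefix=""):
    """Upload finished files to a MinIO bucket; returns the uploaded keys."""
    from minio import Minio  # assumption: installed in the ETL container

    client = Minio(
        os.environ["MINIO_ENDPOINT"],          # hypothetical, e.g. "minio:9000"
        access_key=os.environ["MINIO_ACCESS_KEY"],
        secret_key=os.environ["MINIO_SECRET_KEY"],
        secure=False,                          # assumes plain HTTP on a private network
    )
    keys = []
    for path, key in plan_uploads(output_dir, prefix):
        client.fput_object(bucket, key, str(path))
        keys.append(key)
    return keys
```

The same helper would serve both GIS/Weather and business ETL jobs, for example `push_to_minio("/data/out", "gis-finished", prefix="solar-farms")`, since it only depends on a directory and a bucket.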
- What level of integration with existing documentation containers is desired?
ANSWER: Sibling relationship with loose integration. Both will be used in CTO mode but for different purposes.
- Are there specific deployment environments to target (local, cloud, edge)?
Well, the ultimate goal is that some data sets get pushed to MinIO buckets for use by various lines of business.
This is all kind of new to me. I am a technical operations/system admin and easing my way into devops/sre and swe.
ANSWER: Primarily local deployment (workstation for prototyping, large servers for production). Data output to MinIO for business use. Targeting self-hosted environments for full control and privacy.