API & WEB SCRAPING

Collecting Data From External Sources Through Automated Pipelines

Accessing external data requires navigating layered structures, routes, endpoints, and web elements that are not always designed for direct extraction.

The circuit-like visual represents the journey through HTML, selectors, API endpoints, and response parsing—turning complex structures into clean, usable datasets.

This section includes projects where I collect, clean, and integrate data from APIs or websites through automated extraction pipelines.


Collecting external data by navigating web structures and programmable interfaces.

Accessing external data often requires navigating layered web structures, endpoints, and formats that are not designed for direct analysis. These projects focus on extracting data from websites and public APIs using Python, transforming fragmented and heterogeneous sources into clean, structured datasets ready for exploration or automation.

By working with static and dynamic web pages, RESTful APIs, and multiple data formats such as JSON, CSV, and XML, I build automated extraction pipelines that handle real-world challenges like pagination, authentication, rate limits, and dynamic content. Each project emphasizes reliability, scalability, and clean downstream integration—ensuring that externally sourced data can be effectively analyzed, visualized, or incorporated into broader analytical workflows.
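The pagination and rate-limit handling mentioned above can be sketched as a small retrieval loop. The `{"results": ..., "has_more": ...}` page shape and the `fake_fetch` helper are illustrative assumptions, not any specific API's contract; real sources signal pagination differently (cursors, `Link` headers, empty pages):

```python
import time
from typing import Callable, Iterator

def paginate(fetch_page: Callable[[int], dict], delay: float = 0.0) -> Iterator:
    """Yield records page by page until the source reports no next page.

    `fetch_page` maps a page number to a payload shaped like
    {"results": [...], "has_more": bool} -- an assumed schema used
    here for illustration only.
    """
    page = 1
    while True:
        payload = fetch_page(page)
        yield from payload.get("results", [])
        if not payload.get("has_more"):
            break
        page += 1
        time.sleep(delay)  # crude rate limiting: pause between page requests

# Hypothetical two-page source standing in for a live endpoint.
def fake_fetch(page: int) -> dict:
    pages = {1: {"results": ["a", "b"], "has_more": True},
             2: {"results": ["c"], "has_more": False}}
    return pages[page]

records = list(paginate(fake_fetch))  # ["a", "b", "c"]
```

Keeping the fetch function pluggable makes the same loop testable offline and reusable across endpoints that differ only in how a page is retrieved.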

Related Projects

Weather Data Fetcher 

This project retrieves real-time weather data from the Open-Meteo API and stores it in a structured CSV format for further analysis. Configured for Buenos Aires, Argentina 🇦🇷, it captures key variables such as temperature, wind speed, weather codes, and timestamps, focusing on clean data extraction and reusability.
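A minimal sketch of the extraction step, assuming the field names of Open-Meteo's `current_weather` response block; the payload here is a hand-written sample, not a live API response:

```python
import csv
import io

# Field names follow Open-Meteo's `current_weather` block (an assumption
# for this sketch; verify against the live response before relying on it).
FIELDS = ["time", "temperature", "windspeed", "weathercode"]

def weather_to_csv_row(payload: dict) -> dict:
    """Pick only the tracked variables out of a weather API response."""
    current = payload["current_weather"]
    return {field: current[field] for field in FIELDS}

# Illustrative sample payload for Buenos Aires.
sample = {"current_weather": {"temperature": 21.4, "windspeed": 13.0,
                              "winddirection": 90, "weathercode": 3,
                              "time": "2024-06-01T15:00"}}

buf = io.StringIO()  # stands in for the CSV file on disk
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerow(weather_to_csv_row(sample))
```

Filtering the response down to a fixed field list keeps the CSV schema stable even if the API adds variables the tracker does not use.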

Crypto Price Tracker

This project collects Bitcoin price data from the CoinGecko API over the last seven days, stores the historical prices in a CSV file, and visualizes the trend through a line chart. The goal is to demonstrate API integration, time-series data handling, and basic data visualization.
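The time-series handling step can be sketched as below, assuming CoinGecko's `market_chart` shape of `[millisecond timestamp, price]` pairs; the sample pairs are made up for illustration, and the charting step is omitted:

```python
from datetime import datetime, timezone

def to_rows(prices: list) -> list:
    """Convert [millisecond timestamp, price] pairs into
    (ISO-8601 UTC timestamp, price) rows ready for a CSV writer."""
    return [(datetime.fromtimestamp(ms / 1000, tz=timezone.utc).isoformat(), price)
            for ms, price in prices]

# Illustrative sample, not live CoinGecko data.
sample = [[1717200000000, 67500.12], [1717286400000, 68010.55]]
rows = to_rows(sample)
```

Converting the millisecond epochs to ISO-8601 strings up front means the stored CSV is directly readable by humans and by plotting libraries alike.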

News Headlines Scraper

This project uses Python and BeautifulSoup to scrape the latest news headlines from the Infobae homepage. It extracts headline titles, records the extraction timestamp, and stores the results in a CSV file, demonstrating automated web data collection and basic text data structuring.
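The parsing step can be sketched with BeautifulSoup on an inline HTML sample standing in for the fetched homepage; the `h2` selector is an assumption for illustration, since a real site needs its own selectors:

```python
from datetime import datetime, timezone
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

# Inline sample in place of a downloaded homepage.
html = """
<html><body>
  <h2>First headline</h2>
  <h2>Second headline</h2>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
fetched_at = datetime.now(timezone.utc).isoformat()  # extraction timestamp

# One (title, timestamp) row per headline, ready for csv.writer.writerows().
rows = [(h2.get_text(strip=True), fetched_at) for h2 in soup.select("h2")]
```

Recording the extraction timestamp alongside each title lets repeated runs of the scraper be stacked into one file without losing when each headline was seen.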