This project is a Python-based Jupyter Notebook that executes a full data science workflow: collecting hourly weather data for Pasay City from Weather Underground, cleaning and preparing it for analysis, visualizing it in Power BI, and building a machine learning model to predict temperature.
The get_data(month, day) function uses Selenium to navigate a Chrome instance to the correct URL for each day. After a pause to allow JavaScript rendering, BeautifulSoup parses the page source (driver.page_source), locates the main observation table, and iterates through its rows (<tr>) and cells (<td>) to extract data points into lists.

The scraped data is then cleaned by dropping missing values with .dropna() and removing anomalous rows where the atmospheric pressure reads "0".

For modeling, a LinearRegression model from scikit-learn is used. Categorical features such as 'Wind' and 'Condition' are numerically encoded, the data is split into training and testing sets using train_test_split, and the model is trained on the training data.
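The table-parsing portion of the scraping step can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the HTML snippet stands in for driver.page_source, and the table structure and class name are assumptions about Weather Underground's markup.

```python
from bs4 import BeautifulSoup

# Sample HTML standing in for driver.page_source; the real observation
# table's layout and class names are assumed, not confirmed.
html = """
<table class="mat-table">
  <tr><th>Time</th><th>Temperature</th><th>Pressure</th></tr>
  <tr><td>12:00 AM</td><td>82 F</td><td>29.83 in</td></tr>
  <tr><td>1:00 AM</td><td>81 F</td><td>29.80 in</td></tr>
</table>
"""

def parse_observation_table(page_source):
    """Extract each body row of the observation table as a list of cell strings."""
    soup = BeautifulSoup(page_source, "html.parser")
    table = soup.find("table")  # the notebook targets the main observation table
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # the header row has <th> cells only, so it is skipped
            rows.append(cells)
    return rows

print(parse_observation_table(html))
```

In the real notebook this function would be called once per day's page, with the per-cell strings appended to the per-column lists described above.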
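The cleaning and modeling steps described above can be sketched end to end. The column names and the use of pd.factorize for encoding are assumptions for illustration; the tiny synthetic frame stands in for the scraped hourly observations.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the scraped hourly observations; column names
# mirror the features named in the text but are assumptions.
df = pd.DataFrame({
    "Temperature": [82.0, 81.0, 80.0, None, 84.0, 86.0, 85.0, 83.0],
    "Humidity":    [70,   74,   78,   80,   65,   60,   62,   68],
    "Pressure":    [29.8, 29.8, 29.9, 29.9, 0.0,  29.7, 29.8, 29.9],
    "Wind":        ["N", "NNE", "N", "E", "N", "S", "S", "NNE"],
    "Condition":   ["Fair", "Cloudy", "Fair", "Rain", "Fair", "Fair", "Cloudy", "Fair"],
})

# Cleaning: drop missing values, then remove anomalous rows where pressure is 0.
df = df.dropna()
df = df[df["Pressure"] != 0].copy()

# Numerically encode the categorical features (factorize is one simple choice;
# the notebook may use a different encoder).
df["Wind"] = pd.factorize(df["Wind"])[0]
df["Condition"] = pd.factorize(df["Condition"])[0]

X = df[["Humidity", "Pressure", "Wind", "Condition"]]
y = df["Temperature"]

# Split into training and testing sets, then fit the linear model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)
model = LinearRegression().fit(X_train, y_train)
preds = model.predict(X_test)
print(preds.shape)
```

The split ratio and random seed here are arbitrary; in practice the model would be evaluated on the held-out test set with a metric such as mean absolute error.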