This project is a Python-based Jupyter Notebook that executes a full data science workflow: collecting hourly weather data for Pasay City from Weather Underground, cleaning and preparing it for analysis, visualizing it in Power BI, and building a machine learning model to predict temperature.
The `get_data(month, day)` function uses Selenium to navigate a Chrome instance to the correct Weather Underground URL for each day. After a pause to allow the page's JavaScript to render, BeautifulSoup parses the page source (`driver.page_source`), finds the main observation table, and iterates through its rows (`<tr>`) and cells (`<td>`) to extract the data points into lists.
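The parsing half of this step can be sketched as follows. The function name `parse_observation_rows` and the single-`<table>` layout are illustrative assumptions, not the notebook's actual code:

```python
from bs4 import BeautifulSoup

def parse_observation_rows(page_source):
    """Extract the text of each <td> cell, row by row, from the first table.

    Hypothetical helper: the notebook's real code may use different
    selectors to locate the observation table.
    """
    soup = BeautifulSoup(page_source, "html.parser")
    table = soup.find("table")            # assumed: the main observation table
    rows = []
    for tr in table.find_all("tr"):       # iterate over the table's rows
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:                         # header rows contain <th>, not <td>
            rows.append(cells)
    return rows

# Tiny self-contained demonstration:
sample = "<table><tr><th>Time</th></tr><tr><td>12:00 AM</td><td>27</td></tr></table>"
rows = parse_observation_rows(sample)     # → [["12:00 AM", "27"]]
```

In the notebook itself, `page_source` would come from Selenium after the rendering pause, e.g. `driver.get(url)`, `time.sleep(...)`, then `parse_observation_rows(driver.page_source)`.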
The raw data is then cleaned by dropping rows with missing values via `.dropna()` and removing anomalous rows where the atmospheric pressure reads “0”.
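The cleaning step can be sketched with pandas as below. The column names and the toy data are illustrative, and the pressure column is assumed to have been converted to numeric already:

```python
import pandas as pd

def clean_weather(df):
    """Drop rows with missing values and rows with an anomalous zero pressure."""
    df = df.dropna()                      # remove rows with any missing value
    df = df[df["Pressure"] != 0]          # remove anomalous zero-pressure rows
    return df.reset_index(drop=True)

# Toy frame for illustration; real columns come from the scraped table.
raw = pd.DataFrame({
    "Temperature": [27.0, None, 28.5, 29.0],
    "Pressure":    [1010.0, 1011.0, 0.0, 1009.0],
})
clean = clean_weather(raw)                # keeps only the first and last rows
```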
A `LinearRegression` model from scikit-learn is used for prediction. Categorical features like ‘Wind’ and ‘Condition’ are numerically encoded, the data is split into training and testing sets using `train_test_split`, and the model is trained on the training data.
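The modeling step can be sketched as below. The feature names, the toy data, and the use of pandas category codes for encoding are illustrative assumptions; the notebook may encode its categoricals differently:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Toy frame standing in for the cleaned weather data.
df = pd.DataFrame({
    "Humidity":    [70, 75, 80, 85, 90, 95, 60, 65],
    "Wind":        ["N", "S", "N", "E", "S", "E", "N", "S"],
    "Temperature": [30.1, 29.5, 29.0, 28.4, 27.9, 27.3, 31.0, 30.5],
})

# Numerically encode the categorical 'Wind' column (the notebook also
# encodes 'Condition'); category codes are one simple scheme.
df["Wind"] = df["Wind"].astype("category").cat.codes

X = df[["Humidity", "Wind"]]
y = df["Temperature"]

# Hold out 25% of the rows for testing, then fit on the training portion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
model = LinearRegression().fit(X_train, y_train)
score = model.score(X_test, y_test)   # R^2 on the held-out rows
```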