Jacob Initial Post
by Jacob Dichter
July 26, 2024
Test post.
import sys
import numpy as np
import pandas as pd
import math
##---Return products of Gaussian probabilities for each feature
def compute_gaussian(x, mean, var):
    res = 1
    for i in range(0, len(x)):
        exp = math.exp(-((x[i]-mean[i])**2 / (2 * var[i] )))
        res *= (1 / (math.sqrt(2 * math.pi * var[i]))) * exp
    return res
### ---- IRIS DATASET ---- ###
def main():
    ##---Load in the Iris dataset using Pandas read_csv
    df = pd.read_csv("C:/Users/jacob/Documents/M.S. COMPUTER SCIENCE/Spring 2023/Data Mining/Assignment 2/iris.csv")
    #---Randomize data and assign randomized data into dfrandom
    #--.sample shuffles our data (random_state fixes the seed so the shuffle is repeatable)
    #--.reset_index creates a new index for the shuffled data and drops the old index
    dfrandom = df.sample(frac=1, random_state=1119).reset_index(drop=True)
    print(dfrandom)
    #---convert the first 4 cols to float in case any were read in as strings
    df1 = dfrandom.iloc[:,0:4].astype(float)
    print(df1)
    #---separate out the last column
    df2 = dfrandom.iloc[:,4]
    #---combine the 4 numerical columns and the last column that has the flower category
    dfrandom = pd.concat([df1,df2],axis=1)
    print(dfrandom)
    #---separate the data into training (first 100 rows) and test (remaining rows) parts
    dftrain = dfrandom.iloc[0:100,:]
    print(dftrain)
    dftest = dfrandom.iloc[100:,:]
    print(dftest)
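The listing stops right after the train/test split, so it never shows `compute_gaussian` being called. Below is a minimal sketch of one way it could be used, assuming the goal is a Gaussian naive Bayes classifier (the post does not say so explicitly). `fit_class_stats` and `predict` are hypothetical helpers, the toy rows are made up rather than taken from Iris, and plain Python lists stand in for the DataFrame rows.

```python
import math


def compute_gaussian(x, mean, var):
    """Product of per-feature Gaussian densities, same formula as the listing."""
    res = 1.0
    for i in range(len(x)):
        exp = math.exp(-((x[i] - mean[i]) ** 2) / (2 * var[i]))
        res *= (1 / math.sqrt(2 * math.pi * var[i])) * exp
    return res


def fit_class_stats(rows, labels):
    """Hypothetical helper: prior, per-feature mean and variance for each class."""
    stats = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        n = len(members)
        means = [sum(col) / n for col in zip(*members)]
        variances = [
            sum((v - m) ** 2 for v in col) / n
            for col, m in zip(zip(*members), means)
        ]
        stats[label] = {"prior": n / len(rows), "mean": means, "var": variances}
    return stats


def predict(x, stats):
    """Hypothetical helper: class with the largest prior * likelihood."""
    scores = {
        label: s["prior"] * compute_gaussian(x, s["mean"], s["var"])
        for label, s in stats.items()
    }
    return max(scores, key=scores.get)


# Made-up toy data standing in for the training split (not real Iris measurements).
toy_rows = [[1.0, 2.0], [1.2, 2.1], [0.8, 1.9], [5.0, 7.0], [5.3, 7.2], [4.7, 6.8]]
toy_labels = ["a", "a", "a", "b", "b", "b"]

toy_stats = fit_class_stats(toy_rows, toy_labels)
print(predict([1.1, 2.0], toy_stats))
print(predict([5.1, 7.1], toy_stats))
```

With the DataFrames from the listing, the inputs could be built with something like `dftrain.iloc[:, 0:4].values.tolist()` and `dftrain.iloc[:, 4].tolist()`. Note that a feature with zero variance inside a class would cause a division by zero, so real data may need a small smoothing term.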
mkdir enguno
cd ..
cd ..
mv assets/style/* ./
The chart below shows the number of messages sent by Daniel and me over the decade preceding 2023. There is a low period while we lived together, from 2016 into 2017, followed by a peak around the time we were both working full-time, whether in the office or remotely. Finally, the count declines after we both met our girlfriends and began relationships.
Notes from old Home Page:
Gaussian Mixture Model
\(p(x) = \sum_{k=1}^K \pi_k \cdot \mathcal{N}(x \mid \mu_k, \Sigma_k)\)
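As a quick illustration of the mixture density above, here is a minimal sketch for the one-dimensional case, where each \(\Sigma_k\) reduces to a scalar variance. The function names and the parameter values are assumptions chosen for the example, not something fitted to data.

```python
import math


def normal_pdf(x, mu, var):
    """Density of a univariate normal with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def gmm_density(x, weights, means, variances):
    """p(x) = sum_k pi_k * N(x | mu_k, sigma_k^2) for a scalar x."""
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("mixing weights must sum to 1")
    return sum(
        w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)
    )


# Illustrative parameters, chosen arbitrarily for K = 2 components.
weights = [0.3, 0.7]
means = [-2.0, 1.0]
variances = [0.5, 1.5]

# Sanity check: a density should integrate to roughly 1 (simple Riemann sum on [-15, 15]).
step = 0.01
area = sum(
    gmm_density(-15 + i * step, weights, means, variances) * step
    for i in range(3001)
)
print(round(area, 3))
```

The weights \(\pi_k\) must sum to 1, which is why the sketch validates them before evaluating the sum.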
A few suggestions:
- Simple links as above. Can be short write-ups or just links to Kaggle notebooks or hosted Jupyter notebooks. I would suggest something like Chris Khan’s hosted notebooks.
A few more Markdown notes:
Asterisks can be used for emphasis (one *) and strong emphasis (two **), and tildes for other formatting choices such as strikethrough (two ~~).
Unordered lists can be indicated with * or - or +
- I think
- Minus
- Plus
We can also mark inline code with single backticks.
Code blocks can be inserted with triple backticks (```), optionally followed by the name of the language right after the opening fence:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
data = pd.DataFrame({'Brand': ['Maruti', 'Hyundai', 'Tata',
                               'Mahindra', 'Maruti'],
                     'Year': [2012, 2014, 2011, 2015, 2012],
                     'Kms Driven': [50000, 30000, 60000, 25000, 10000],
                     'City': ['Gurgaon', 'Delhi', 'Mumbai', 'Delhi', 'Mumbai'],
                     'Mileage': [28, 27, 25, 32, 26]})
LaTeX testing:
Block math formatting:
\(E = mc^2\)
Linear regression
\(y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \epsilon\)
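To make the notation concrete, here is a small sketch that evaluates the right-hand side of the regression equation for one observation and recovers the error term \(\epsilon\) as the gap between an observed \(y\) and the linear predictor. The coefficients and the observation are made-up numbers, and `linear_predictor` is a hypothetical helper name.

```python
def linear_predictor(beta0, betas, x):
    """Return beta_0 + beta_1 * x_1 + ... + beta_p * x_p for one observation."""
    if len(betas) != len(x):
        raise ValueError("need exactly one coefficient per feature")
    return beta0 + sum(b * xj for b, xj in zip(betas, x))


# Made-up coefficients and a single made-up observation with p = 2 features.
beta0 = 1.0
betas = [2.0, -0.5]
x = [3.0, 4.0]

y_hat = linear_predictor(beta0, betas, x)  # 1.0 + 2.0*3.0 - 0.5*4.0
y_observed = 5.4
epsilon = y_observed - y_hat  # the part of y the linear terms do not explain

print(y_hat, epsilon)
```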
Normal distribution
\(f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2 \sigma^2}\right)\)
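The density above is the same expression that `compute_gaussian` multiplies together feature by feature. A minimal sketch of the single-variable case, cross-checked against `statistics.NormalDist` from the Python standard library (the function name `normal_pdf` and the test points are assumptions for the example):

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma2):
    """f(x | mu, sigma^2) written exactly as in the formula above."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


# Standard normal at its mean: the exponent vanishes, leaving 1 / sqrt(2 * pi).
value = normal_pdf(0.0, 0.0, 1.0)

# NormalDist takes the standard deviation (sigma), not the variance (sigma^2).
reference = NormalDist(mu=0.0, sigma=1.0).pdf(0.0)

print(value, reference)
```

The one thing to watch is the parameterization: the formula and `compute_gaussian` both take the variance, while `NormalDist` expects the standard deviation.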
