asad70 / insider-trading
This program extracts insider trading data from the SEC website and stores it in an Excel file for the specified time frame.
License: MIT License
No tables found
Retreiving insider trades for the symbol "aal" failed. Skipping...
Hello, I am trying to get this program running, but I ran into a problem. When I execute the program, it asks me for the tickers, the start date, and the filename to extract to. I entered those, and immediately afterwards it gave me the following error:
Traceback (most recent call last):
  File "C:\Users\jimmi\Desktop\Python Programs\Stock_Web_Scraper.py", line 203, in <module>
    data_df(all_df)
  File "C:\Users\jimmi\Desktop\Python Programs\Stock_Web_Scraper.py", line 139, in data_df
    all_data, headers = insiders(all_data, cik)
  File "C:\Users\jimmi\Desktop\Python Programs\Stock_Web_Scraper.py", line 92, in insiders
    for i in report.children:
AttributeError: 'NoneType' object has no attribute 'children'
This is the code I am using:
`import requests
from bs4 import BeautifulSoup
import pandas as pd
import datetime
import csv

fName = 'ticker and cik.csv'    # set the ticker/CIK file name
data_dict, symbols = {}, []
with open(fName, 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        data_dict[row[0]] = row[1]  # creating a dict ex. format: 'msft': '789019'
        symbols.append(row[0])

'''taking user input'''
ticker = input("Enter a ticker (ex: 'AAPL, MSFT') or type 'all' to search thru all the tickers in file: ").replace(" ", "")
ticker = ticker.lower()
if ticker == 'all': symbols = symbols
else:
    tickers = ticker.split(",")
    symbols = tickers

start = input("Enter the starting date (Ex: 2020-MM-DD): ")
start2 = datetime.datetime(int(start[:4]),int(start[5:7]),int(start[8:]))
end_date = datetime.date.today()
extract = input("Would you like to extract data to excel file (Press enter for no OR enter filename): ")
print()

def transaction(url):
    '''gets the transaction report from url
    Parameter: url: string
                   url for data extraction
    Return:    trans_report: soup object
                   transaction report
    '''
    response = requests.get(url)
    web = response.content
    soup = BeautifulSoup(web, 'html.parser')
    trans_report = soup.find('table', {'id':'transaction-report'})
    return trans_report

def insiders(all_data, cik):
    '''gets the insider info of given cik number
    Parameter: all_data: list
                   empty list to add data
               cik: string
                   cik number
    Return:    all_data: list
                   list of all the data
               headers: list
                   list of the headers "Acquistion or Disposition...."
    '''
    num = 0
    url = f'https://www.sec.gov/cgi-bin/own-disp?action=getissuer&CIK={cik}&type=&dateb=&owner=include&start={num}'
    urls, c = [], 0
    urls.append(url)

    for url in urls:
        c2, headers, data = 0, [], []
        report = transaction(url)

        for i in report.children:
            if i != '\n':
                collection = i.get_text().split('\n')
                c2 += 1
                for x in collection:
                    if x!= '':  # takng out empty data
                        if len(headers) != 12:  # 12 headers
                            headers.append(x)
                        else:
                            x = x.replace('$','')
                            # checking date boundary
                            if c == 1 and start > x:    # @ c==1, x is date
                                return all_data, headers
                            elif (x == 'A' or x == 'D'):    # start of new row
                                if c != 0:
                                    all_data.append(data)
                                data, c = [], 0     # new row: clear data
                                data.append(x)
                                c += 1
                            else:
                                data.append(x)
                                c += 1

        if c2 == 81:
            all_data.append(data)
            num += 1
            url = f'https://www.sec.gov/cgi-bin/own-disp?action=getissuer&CIK={cik}&type=&dateb=&owner=include&start={num*80}'
            urls.append(url)

    return all_data, headers

def data_df(all_df):
    '''gets the insider info of given cik number
    Parameter: all_df: list
                   list of all the data frame
    Return:    None
    '''
    done = 0
    for symbol in symbols:
        all_data = []
        # key error handling
        try: cik = data_dict[symbol]
        except Exception as e:
            print(f"The symbol {e} is not valid, rest of the data will be saved to excel file (if available)")
            if symbol == symbols[-1]: break
            else: continue

        all_data, headers = insiders(all_data, cik)
        # adding to data frame
        df = pd.DataFrame.from_records(all_data, columns=headers)
        done += 1
        print(f"Finished extracting {symbol.upper()} insider data from {start} till {end_date}.")
        print(f"Finished: {done}/{len(symbols)} symbols.")

        df['Purchchase'] = pd.to_numeric(df['Acquistion or Disposition'].apply(lambda x: 1 if x == 'A' else 0) * df['Number of Securities Transacted'])
        df['Sale'] = pd.to_numeric(df['Acquistion or Disposition'].apply(lambda x: 1 if x == 'D' else 0) * df['Number of Securities Transacted'])

        name = 'Transaction Type'
        sell = df['Transaction Type'].str.count("S-Sale").sum()
        buy = df['Transaction Type'].str.count("P-Purchase").sum()
        sale = df['Acquistion or Disposition'] == 'D'
        purch = df['Acquistion or Disposition'] == 'A'
        num_purch = len(df[purch])
        num_sale = len(df[sale])
        total_sale = int(df['Sale'].sum(skipna=True))
        total_purch = int(df['Purchchase'].sum(skipna=True))

        # adding data to separate df for excel
        symbol_df = pd.DataFrame({'Symbol': symbol.upper(),
                                  '# of Purchases': num_purch,
                                  '# of Sales': num_sale,
                                  'Total Bought': f'{total_purch:,}',
                                  'Total Sold': f'{total_sale:,}',
                                  'S-Sale count': f'{sell}',
                                  'P-Purchase count': f'{buy}'},
                                 index = [1])

        # handling division by zero error/adding data to separate df for excel
        try:
            avg_sale = int(total_sale/num_sale)
            avg_purch = int(total_purch/num_purch)
            ratio = round(num_purch/num_sale, 2)    # purchase to sell ratio
        except ZeroDivisionError as e:
            print(f"\n{e} error for '{symbol}' there isn't much data available to calculate avg sale/purch & ratio, data will be added to excel without these metrics")
        else:
            symbol_df.insert(3, 'Buy/Sell Ratio', ratio)
            symbol_df.insert(6, 'Avg Shares Bought', avg_purch)
            symbol_df.insert(7, 'Avg Shares Sold', avg_sale)

        symbol_df.set_index('Symbol', inplace=True)
        all_df.append(symbol_df)

def excel(all_df):
    '''adds the data to excel sheet, data is saved in same floder as this file
    Parameter: all_df: list
                   list of all the data frame
    Return:    None
    '''
    try:
        if len(extract) != 0:
            pd.concat(all_df).to_excel(extract +'.xlsx', index = True)
            print(f"Extracted the data to {extract}.xlsx\n")
    except Exception as e: print(e)

if __name__ == '__main__':
    all_df = []
    data_df(all_df)
    excel(all_df)`
Thank you in advance for the help.
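For reference, the traceback shows that `report` was `None` when `insiders` tried to read `report.children`. BeautifulSoup's `find` returns `None` when no element matches, so the page returned for that URL apparently did not contain a table with the id `transaction-report`. Why the table was missing is not visible from the traceback. One possibility, stated here only as an assumption, is that sec.gov rejected the request because it carried no identifying `User-Agent` header. The sketch below shows one way to guard the loop so a missing table yields no rows instead of crashing; `rows_or_empty` is a hypothetical helper name, not part of the original program.

```python
def rows_or_empty(report):
    """Return the non-newline children of a report table, or [] if it is missing.

    `report` is whatever soup.find('table', {'id': 'transaction-report'})
    returned; find() gives back None when nothing on the page matches.
    Hypothetical helper for illustration, not part of the original script.
    """
    if report is None:
        # No table on the page: treat it as "no rows" instead of raising
        # AttributeError on report.children.
        return []
    # Same filter the original loop applies with `if i != '\n'`.
    return [child for child in report.children if child != '\n']
```

Inside `insiders`, the loop could then read `for i in rows_or_empty(report):`. Printing `response.status_code` in `transaction` when the table is missing would help confirm whether the request itself was refused.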