The data covers mutual funds in the USA. The objective of this problem is to predict the basis point spread over AAA bonds, i.e. the feature 'bonds_aaa', for each serial number. The basis point spread indicates the additional return a mutual fund gives over AAA-rated bonds.
After completing this project, I had a better understanding of how to apply linear models using GridSearchCV. Topics covered:
Chi-square contingency test
Box plot
Linear regression
GridSearchCV
Ridge and Lasso regressors
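As a rough illustration of the GridSearchCV step with Ridge and Lasso regressors, here is a minimal sketch. It runs on synthetic data, not the mutual fund dataset, and the alpha grid, fold count, and variable names are assumptions chosen for the example rather than the values used in the project.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the processed fund features (NOT the real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_coef = np.array([1.5, 0.0, -2.0, 0.0, 0.5])
y = X @ true_coef + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Hypothetical grid of regularization strengths to search over.
param_grid = {"alpha": [0.001, 0.01, 0.1, 1.0, 10.0]}

results = {}
for name, model in [("ridge", Ridge()), ("lasso", Lasso(max_iter=10000))]:
    # 5-fold cross-validation on the training split picks the best alpha.
    search = GridSearchCV(
        model, param_grid, cv=5, scoring="neg_mean_squared_error"
    )
    search.fit(X_train, y_train)
    results[name] = {
        "best_alpha": search.best_params_["alpha"],
        # R^2 of the refitted best model on the held-out split.
        "test_r2": search.best_estimator_.score(X_test, y_test),
    }

for name, res in results.items():
    print(name, res)
```

On the real data, X would hold the processed fund features and y the 'bonds_aaa' column; the grid would likely need tuning to the scale of those features.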
About the Dataset
The data covers mutual funds in the USA. A brief description of the features in this data follows.
Fund Symbol: Unique symbol used to represent the mutual fund on the bourses.
Fund Name: Full name of the mutual fund scheme.
Category: Investment category of the mutual fund.
Fund Family: Asset management company to which the mutual fund belongs.
Investment: Type of investment of the mutual fund scheme.
Size: Size of the mutual fund based on the total net assets.
Total net assets: Total assets under management for the mutual fund scheme.
Currency: Currency in which the investments of the mutual fund are held.
Net Annual Expense Ratio: The fee that the asset management company charges its clients, as a percentage of the total assets.
Morningstar Rating: Overall fund rating given by the rating agency Morningstar, on a scale of 1 to 5 where 5 is the best.
Inception Date: The date on which the mutual fund scheme was started.
portfolio: Percentage of total assets invested in the investment instrument.
sectors: Percentage of equity assets invested in the sector.
Morningstar Return Rating: Fund rating based on returns, given by the rating agency Morningstar, on a scale of 1 to 5 where 5 is the best.
Returns_ytd: Year-to-date return of the mutual fund.
returns: Annual return of the mutual fund for the respective year.
Morningstar Risk Rating: Fund rating based on the risk of the mutual fund, given by the rating agency Morningstar, on a scale of 1 to 5 where 5 is the best.
Alpha 3y: 3-year average alpha of the mutual fund.
Beta 3y: 3-year average beta of the mutual fund.
Mean Annual Return 3y: 3-year mean annual return.
Standard Deviation 3y: Standard deviation of returns over three years.
Sharpe Ratio 3y: 3-year average Sharpe ratio of the mutual fund.
bonds_*: Basis point spread over the bonds for the mutual fund.

The original data contains many non-numerical features and missing values.
We will learn how to handle these cases in future concepts. For this project, the data has already been processed so that we can concentrate on building the model.