haghish / github

a module for building, searching, installing, managing, and mining Stata packages from GitHub

Home Page: http://haghish.github.io/github/

Languages: Stata 96.53%, R 3.47%
Topics: stata, stata-command, installer, package, package-manager, package-tracker, search-engine

github's Introduction

I am a researcher at the University of Oslo, specializing in artificial intelligence applications for mental health, particularly adolescent suicide attempts and violent extremism. My focus is on developing machine learning algorithms tailored to predicting rare outcomes, i.e., modeling outcomes under severe class imbalance. My research extends to enhancing machine learning transparency, particularly in conceptualizing and stabilizing feature importance. I am also interested in advancing statistical methods for missing data imputation, and to this end I have developed a machine learning imputation algorithm for single and multiple imputation that outperforms common statistical procedures.

I use GitHub mostly for software development in R, Stata, and Python. Below is a list of free software I've developed. Almost all of my Python packages are developed for industry and thus are not publicly available. Feel free to contact me for feedback or ideas regarding my algorithms and packages. For updates on my software, follow me on Twitter: @haghish.

R packages

I have written multiple R packages for artificial intelligence as well as general statistical use. My recent software focuses particularly on machine learning: for example, missing data imputation with machine learning, automated stacked ensemble models for classification under severe class imbalance, toolkits for comparing different properties of machine learning models, and innovative procedures for assessing model transparency and classification fairness.

Name | Description
--- | ---
shapley | Weighted Mean Shapley Values with Confidence Intervals for Machine Learning Grids and Stacked Ensembles
mlim | Single and Multiple imputation with automated machine learning
autoEnsemble | An AutoML Algorithm for Building Homogeneous or Heterogeneous Stacked Ensemble Models by Searching for Diverse Base-Learners
fair | Machine Learning Fairness Evaluation and Classification Parity Testing
adjROC | ROC Curve Evaluation at a Given Threshold
h2otools | Machine Learning Model Evaluation for 'h2o' Package
DOT | An R Package that Renders and Exports Graphviz DOT diagrams in SVG and PNG format
convertGraph | An R package for converting graphical files to one another
md.log | A Markdown log system with function call

Stata packages

Name | Description
--- | ---
rcall | Seamless interactive R in Stata. rcall allows communicating data sets, matrices, variables, and scalars between Stata and R conveniently
markdoc | A literate programming package for Stata which develops dynamic documents, slides, and help files in various formats
github | a module for building, searching, installing, managing, and mining Stata packages from GitHub
machinelearning | A Stata module for machine learning algorithms, implemented within R using rcall package
diagram | Graphviz and DOT Path Diagrams in Stata
weaver | A Stata Log System in HTML or LaTeX for Dynamic Document and literate programming in Stata
neat | A Stata layout module for creating geometric shapes out of replicated observations in Stata scatter plots
statax | JavaScript and LaTeX Syntax Highlighter for Stata
md2smcl | A Stata module that converts Markdown to SMCL language
colorcode | A Stata module to return RGB, CMYK, and HSV values for Stata colors

Python packages

Name | Description
--- | ---
chase | evolutionary psychology experiment designed in a 2D video game form

github's People

Contributors

bquistorff, haghish, lisshall, remlapmot

github's Issues

MAKEDLG: question regarding TOC and PKG generation

Is it possible to pull files from a subdirectory within my GitHub repository?

My stata ado files are organized in the following structure

.../src/a
.../src/e
.../src/_

I would like to generate a pkg that would allow me to install files from subdirectories within my repo.
Is that possible, or do all files need to be in the root directory?

I have used the following syntax to make my TOC and PKG:

makedlg adecomp, ///
    replace toc pkg ///
    title(`"ADECOMP: Stata module to estimate Shapley Decomposition by Components of a Welfare Measure"') ///
    version("1.1") license("MIT") author("Joao Pedro Azevedo") affiliation("World Bank") ///
    url("http://www.worldbank.org/en/about/people/j/joao-pedro-azevedo") ///
    email("[email protected]") ///
    install(`""C:\Users\wb255520\Documents\myados\adecomp\src\a\adecomp.ado" "C:\Users\wb255520\Documents\myados\adecomp\src\a\adecomp.sthlp" "C:\Users\wb255520\Documents\myados\adecomp\src__ex_adecomp.ado""') ///
    ancillary(`""C:\Users\wb255520\Documents\myados\adecomp\src\e\exdata_adecomp.dta""')

Best regards,

JP

Can it support the use of a proxy?

Due to certain reasons, I often encounter network issues that prevent me from completing package installations using this command.

remote connection failed

I'm unsure how to resolve this. Can it support proxy settings or automatically use the default proxy settings?
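
For reference, Stata itself has proxy settings that its built-in network commands use. The lines below are only a sketch: the host name and port are hypothetical placeholders, and it is an assumption (not confirmed anywhere in this thread) that every download step of github install goes through Stata's proxy-aware networking.

```stata
* Hypothetical proxy host and port; replace with your network's values.
set httpproxyhost "proxy.example.org"
set httpproxyport 8080
set httpproxy on

* Retry the installation once the proxy is configured.
github install haghish/markdoc
```

If the proxy requires credentials, Stata also provides set httpproxyauth, set httpproxyuser, and set httpproxypw.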

outdated gitget list

Hi!
Your list of packages installable via gitget seems to be outdated (the last update seems to have been in January). For instance, this package is missing: https://github.com/CTU-Bern/btable. Can you update it? Or what do we have to do to get btable into your list?
Thanks!

githubhot.ado not found

I am trying to install github on my new computer's Stata 15, and am getting the error "githubhot.ado" not found.

Error 133 unknown function ustrlower()

The github command throws an error every time I try to install something, and I can't figure out why.

. github install haghish/markdoc, version(3.9.6) replace 
unknown function ustrlower()
r(133);

Stata version 13.1
Windows 10 professional.

Materials by `author'

github/make.ado

Line 107 in eb256ab

file write `tocfile' "d Materials by `author'" _n

Sorry if I am misunderstanding here, but I am not sure what Materials means here. Was it supposed to be the name of the package instead?

Thanks!

No download source code option

When I use github install username/repository, stable or github install username/repository, version(), it downloads the source code to the current directory in addition to installing the .ado files to the plus directory. I looked for an option to prevent this but could not find one, so I have to delete the source code from the current directory by hand.

It would be great if this Stata command (github install) had an option to do this automatically.

Ps, My Stata version is MP 15.1.

Thanks
Meiting Wang

github update user/repro causes error

Trying to run the update routine causes the following error with version 1.4.1:

package was not found
file https://api.github.com/repos//contents not found

The program is aware of the pkg being installed on my system but refuses to update it.

. github list

Date          Name                   user/repository        Action

23 Nov 2018   estimates_table_docx   glennsandstrom/est~d   update / uninstall

`github` doesn't add itself to the package list

I did a fresh install of github and used it to install rcall. During the installation process its dependency.do queried for the github version, which isn't in the dataset, giving the following error:

. quietly github version github
github package was not found

I'm not sure if this is a rcall or github issue.

force or replace option

Is there any way to have the github ado replace the existing version of user-written files? When using net install, we are offered several options.

Possible things to do:

    1.  Forget it
        (best choice if any of the above files were written by you and just
        happen to have the same name or you do not want the originals changed)

    2.  Look for an already-installed package of the same name
        (which you might then choose to uninstall)

    3.  Search installed packages for the duplicate file(s) by clicking
        on the file names above

    4.  Force installation replacing already-installed files
        (if this is an update, this would be a safe choice; you will end
        up with the original and the update apparently installed, but it
        doesn't matter; you can even uninstall the original later)

It would be great to have the same functionality in the github ado.

HTTPS problem with Stata 15.1 (only problem with github search)

Hello,

I am running the github ado in Stata 15.1, and I keep getting the following error message when I use the search option:

. github search azevedo
the GitHub API is not responsive right now. Try again in 10 or 20 seconds. this can happen if you search GitHub very
frequent... r; t=0.19 14:29:08

The options query, install and uninstall work fine on the same machine with the same Stata installation.

The option search works fine in my personal desktop running Stata 14.

Do others have this problem? Is there a fix?

Best regards,

JP

tars&order=desc&per_page=50" `apifile', replace
= capture qui copy "https://api.github.com/search/repositories?q=azevedo+language:stata+in:name,description&sort=stars&order=desc&per_page=50" C:\Users\wb255520\AppData\Local\Temp\ST_9c58_000003.tmp, replace
- if _rc != 0 {
- di as err "{p}the GitHub API is not responsive right now. Try again in " "10 or 20 seconds. this can happen if you search GitHub very frequent..."
the GitHub API is not responsive right now. Try again in 10 or 20 seconds. this can happen if you search GitHub very frequent...
- exit
-------------------------------------------------- end githubsearch ---
- exit

command abspath is unrecognized

After calling the dbmake command and filling in the options, Stata runs the makedlg command and I get the following error:

command abspath is unrecognized

A quick google search showed me that your package markdoc includes the abspath command, so I installed it with gitget markdoc and makedlg ran without issue. You might want to fix this dependency.
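
Until the dependency is fixed, the workaround above can be made conditional. This is a minimal sketch that only assumes what is described in this report, namely that abspath ships with markdoc and that gitget markdoc installs it:

```stata
* Check whether abspath is already on the ado-path;
* which sets a nonzero return code when the command is not found.
capture which abspath
if _rc != 0 {
    * abspath is missing, so install markdoc, which provides it
    gitget markdoc
}
```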

Thank you for your work!

Package located in subfolder

I have a small question:

In all of my packages, I have the installation files in the src subfolder (example). This is partly because there are often many files, including test files, docs, etc., so it helps to keep things organized.

Is there a way to install from a given subfolder?

Thanks!
Sergio

Option to delete extracted folder after install

Using github to install a particular version of a package, e.g. rcall, (on Windows, Stata 15) leaves behind an rcall-X.X.X folder extracted from the zip file. It would be good to have an option to delete this after installation, especially as this folder has the dynamic element of the version (e.g. "stable" -> "3.0.7"). If I'm using github in a script, it would be hard for the script to know the version always and delete the folder programmatically. It might even be good to have this option by default.
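
In the meantime, a script can find the leftover folder by pattern instead of by exact version. The sketch below is Windows-only and assumes the extracted folder sits in the current working directory and is named rcall- followed by the version; note that it deletes every folder matching that pattern.

```stata
* Collect all folders in the working directory whose names start with rcall-
local folders : dir "." dirs "rcall-*"

* Remove each matching folder and its contents (Windows rmdir syntax)
foreach f of local folders {
    shell rmdir "`f'" /s /q
}
```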

HTTPS compatibility issue

Older versions of Stata do not support HTTPS, and some more recent versions (e.g. 13 with no updates) support HTTPS in principle but are buggy, so it does not work in practice.

You could consider implementing something like the following (this is for Windows) to work around the fact that not all Stata versions can install over HTTPS:

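* Download the repository archive with PowerShell instead of Stata's own HTTPS support
shell powershell -command "& {iwr https://github.com/haghish/Weaver/archive/master.zip -OutFile weaver.zip }"
* Extract the archive into the current working directory (H:/ in this example)
unzipfile weaver.zip
* Install from the extracted local folder rather than over HTTPS
net install weaver, force from("H:/Weaver-master")

* Clean up the downloaded archive and the extracted folder
erase weaver.zip
!rmdir H:\Weaver-master  /s /q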

This is a piece of code that I actually use ;)

Mikko

Option to download files

I have been working on a package, but many of its files have the same names as files already installed in the PLUS directory. Therefore, when I use github install <my repository>, these files are not installed. It would be nice to have an option to force the download and replace the installed files with the ones in the repository.

Some thoughts

Hi E.F.,

This looks incredibly useful. A few thoughts that I hope could be useful:

  • I've had trouble in the past with packages installed from multiple places (e.g. from SSC, GitHub, local folders). There is a file called stata.trk in the ado/plus folder that often gets corrupted if you install a package from multiple places: on the second install, it fails to detect that the package has already been installed from somewhere else; this makes the ado uninstall ... command fail for that package, and can also create issues if the files listed in both versions differ. Once the trk file is corrupt, you need to delete the entire plus folder (or, if you are lucky and spotted it right away, replace it with backup.trk).
  • A way to prevent this whole mess is to run this line cap ado uninstall <package> right before you try to install it.
  • It might also be useful to generalize it or create a wrapper that allows github/ssc/local installs with the same commands. I often want to download a stable version (SSC), or just build a file locally, so having the same command for these ops helps a lot.
  • Dependencies are incredibly useful, but I'm wondering how to deal with having more than one package in the same location. Maybe allow for packagename_dependency.do? (this might also help for packages that get published in SSC or in a centralized repo)
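
The uninstall-before-install idea from the second point could look roughly like this. It is a sketch only; markdoc is just a stand-in package name, taken from another report in this tracker.

```stata
* capture swallows the error that ado uninstall raises when the
* package is not installed, so this line is safe to run every time.
capture ado uninstall markdoc

* Install a fresh copy so stata.trk tracks a single source for the package.
github install haghish/markdoc
```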

Cheers!
Sergio
