aliiftikhar / sodi

C# 100.00%

SODI

SODI (StackOverflowDataImport) is a simple Windows console application (.NET .exe) that imports the StackOverflow data dump (.xml) into a database of your choice. At the moment it supports only MS SQL Server, but it is designed for extension. Feel free to add support for other databases (e.g. MySQL, Redis): all you need to do is implement IDatabaseRepository and configure DatabaseRepositoryStrategy to return your newly added repository when requested.

How to use SODI:

Step 1: Get StackOverflow database files:

Download StackOverflow database files using this torrent link: https://archive.org/download/stackexchange/stackexchange_archive.torrent

You can use a BitTorrent client (http://www.bittorrent.com/) to download the files from the torrent link above.

Unzip the files so that you end up with: Posts.xml, Comments.xml, Votes.xml, Badges.xml, Users.xml and Tags.xml.

Step 2: Download SODI from GitHub

Step 3: Configuring SODI

In the SODI.Console project, set the following appSettings in App.config:

XmlSourceFolderPathIncludingLastSlash = [Folder location of your downloaded xml files]

DatabaseType = SqlServer

DatabaseServerName = [Your Database Server name or IP address]

DatabaseName = [Name of the new Database that will be created as StackOverflow database]

DatabaseUsername = [Your database server username]

DatabasePassword = [Your database server password]
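
As an illustration, the settings above might look like this in App.config. This is a sketch that assumes the standard .NET appSettings layout; the key names come from the list above, while every value (folder path, server name, database name, credentials) is a placeholder to replace with your own.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <appSettings>
    <!-- Placeholder folder; note the trailing slash the key name asks for -->
    <add key="XmlSourceFolderPathIncludingLastSlash" value="C:\StackOverflowDump\" />
    <!-- Must match a name in the DatabaseType enum -->
    <add key="DatabaseType" value="SqlServer" />
    <!-- Placeholder server, database name, and credentials -->
    <add key="DatabaseServerName" value="localhost" />
    <add key="DatabaseName" value="StackOverflow" />
    <add key="DatabaseUsername" value="your-username" />
    <add key="DatabasePassword" value="your-password" />
  </appSettings>
</configuration>
```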

Step 4: Run the Executable

Build the solution and run SODI.Console.

Expected Result:

The console app will create a new database with the name you specified in the DatabaseName appSetting.

A table will be created for each XML file, with the correct columns and data types.

Each table will have a primary key: an ID column.

Depending on the size of the XML files, the import can be very quick (for example, the StackExchange.Programmers XML files) or take a long time (the StackOverflow XML files).

The console output is updated as each XML file is processed. The files are processed in this order: Tags, Users, Badges, Votes, Comments and Posts.

Contributing to SODI: Adding more Database Server support

SODI is designed to be extensible. To add support for another database server, e.g. MySQL, you need to provide an implementation of SODI.Contracts.Repositories.IDatabaseRepository.

The existing SQL Server implementation is located at SODI.Repositories.DatabaseRepositories.SqlServer.

A new MySQL implementation would be expected at: SODI.Repositories.DatabaseRepositories.MySql

Add a new entry for MySQL to SODI.Models.DatabaseType.cs, which is an enum of database types.

Modify the method GetSupportedDatabaseTypeRepositories() in SODI.Repositories.DatabaseRepositories.DatabaseRepositoryStrategy.cs to add the newly created MySqlRepository to the databaseTypeRepositories dictionary.

Finally, the DatabaseType value in the appSettings of App.config must match the name specified in the enum in SODI.Models.DatabaseType.cs.
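
The steps above can be sketched roughly as follows. This is a self-contained illustration, not the actual SODI source: the types here are simplified stand-ins, the Describe() member and the Resolve() helper are hypothetical, and the real IDatabaseRepository members and the real shape of DatabaseRepositoryStrategy are not described in this README, so check them in the repository before copying anything.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for SODI.Models.DatabaseType with the new MySql entry added.
public enum DatabaseType
{
    SqlServer,
    MySql
}

// Stand-in for SODI.Contracts.Repositories.IDatabaseRepository.
// Describe() is a hypothetical placeholder member used only for this sketch.
public interface IDatabaseRepository
{
    string Describe();
}

public sealed class SqlServerRepository : IDatabaseRepository
{
    public string Describe() => "SQL Server repository";
}

// The newly added implementation.
public sealed class MySqlRepository : IDatabaseRepository
{
    public string Describe() => "MySQL repository";
}

// Stand-in for DatabaseRepositoryStrategy (assumed static here for brevity).
public static class DatabaseRepositoryStrategy
{
    // Maps each supported enum value to its repository.
    public static Dictionary<DatabaseType, IDatabaseRepository> GetSupportedDatabaseTypeRepositories()
    {
        return new Dictionary<DatabaseType, IDatabaseRepository>
        {
            { DatabaseType.SqlServer, new SqlServerRepository() },
            { DatabaseType.MySql, new MySqlRepository() } // new entry
        };
    }

    // Hypothetical helper: turns the DatabaseType appSetting string into a repository.
    // Enum.TryParse is case-sensitive by default, so the setting must match the enum name.
    public static IDatabaseRepository Resolve(string databaseTypeSetting)
    {
        if (!Enum.TryParse(databaseTypeSetting, out DatabaseType type) ||
            !GetSupportedDatabaseTypeRepositories().TryGetValue(type, out var repository))
        {
            throw new NotSupportedException($"Unsupported DatabaseType: {databaseTypeSetting}");
        }
        return repository;
    }
}
```

With this layout, a DatabaseType setting of MySql would resolve to MySqlRepository, while a value that is not an enum name (for example mysql in lower case) would be rejected, which is why the setting has to match the enum name exactly.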
