ICPSR

This repo supports the ICPSR NLP & Text Mining 2024 Summer Topic Workshop.

Software requirements

This course requires R and RStudio, both freely available.

Additionally, on day 3 we will explore programmatic requests to "local LLMs". This requires LM Studio. However, students with Intel Macs cannot use this software. Only Apple Silicon Macs (M1, M2, M3), Windows and Linux machines are supported. Students with an Intel Mac should install Ollama as an alternative. If you use Ollama instead of LM Studio, you will have to adjust our class code based on the example below and possibly install a user interface such as webui.

Students with older computers, old GPUs or a small amount of RAM may not be able to execute this portion of the lesson.

For students needing Ollama (less preferred), download it, then perform the tasks at the end of this readme.

Lessons

Day1:

R Setup & Logistics
What are NLP, git, R syntax and RStudio?
Preprocessing steps: string manipulation, term frequency, Chapter 1

Day2:

Bag of Words DTM/TDM
Visualizations: wordclouds, histograms, pyramid plots, word networks, dendrograms, associations
“Homework” HW1: Basics of R Coding

Day3:

Basic sentiment analysis with lexicons
Basic document clustering
LLM Basics, prompting & (time permitting) prompt chains/agentic workflows

Packages for R

R is customized for specific functions using libraries or packages. In this class we will use the following packages. Once you have R and RStudio installed, run the following commands in your console. Don't worry if you struggle; on day 1 we will set aside time to help, though we aren't performing technical support.

# Install library pacman
install.packages('pacman')

# Use pacman to install and load the other libraries
pacman::p_load(dplyr, ggplot2, ggthemes, igraph, networkD3, qdapRegex, slam, stringi, stringr, tm)
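
If you want to confirm the installation worked, a minimal sketch like the one below reports any packages from the list above that R cannot find. It uses only base R. The variable names `needed`, `installed` and `missing_pkgs` are illustrative and are not part of the class code.

```r
# Packages listed in the pacman::p_load() call above
needed <- c("dplyr", "ggplot2", "ggthemes", "igraph", "networkD3",
            "qdapRegex", "slam", "stringi", "stringr", "tm")

# requireNamespace() returns TRUE when a package is installed and loadable
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that still need attention
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) == 0) {
  message("All workshop packages are installed.")
} else {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
}
```

If anything is reported as missing, rerunning the `pacman::p_load()` line is a reasonable first step before asking for help on day 1.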

Please install LM Studio. If you have to install Ollama instead, instructions are below.

For students unable to use LM Studio, here are some setup and testing instructions for Ollama.

In terminal

run ollama

Install a small LLM for testing. This takes a few minutes.

ollama run gemma:2b

You will see a prompt in your terminal like this. You can ask a simple question in the terminal, such as "What is the capital of France?"

>>> Send a message (/? for help)

>>> What is the capital of France?

Next, let's perform a programmatic request while Ollama is running. Open R and try this code. If you get a response in R, your instance is working as intended. You will have to adjust our class code to fit this example API request, which is slightly different. For additional help, this is a great site to help convert cURL requests to multiple languages.

# Libraries
library(httr)
library(jsonlite)

# Inputs
prompt <- "What is the capital of France?" 

# API call inputs
headers <- c(`Content-Type` = "application/json") 
data    <- list(model = "gemma:2b", # Be sure to change to the model name you're using
                prompt = prompt)

# API Request
res <- httr::POST(
  url = "http://localhost:11434/api/generate", 
  httr::add_headers(.headers=headers), 
  body = jsonlite::toJSON(data, auto_unbox = TRUE), 
  encode = "json")

# Parse the streamed response (one JSON object per line)
llmResponse <- httr::content(res,as = "text", encoding = "UTF-8")
llmResponse <- strsplit(llmResponse, "\n")[[1]]
llmResponse <- lapply(llmResponse, fromJSON)
llmResponse <- paste(unlist(lapply(llmResponse, '[', 'response')), collapse = '')
llmResponse
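
To see what the parsing step does without a running model, here is a minimal offline sketch. The `sampleStream` text is a simplified, made-up stand-in for what `httr::content(res, as = "text")` returns: one JSON object per line, each carrying a fragment of the answer in its `response` field. A real response contains additional fields that are omitted here, and the variable names are illustrative only.

```r
library(jsonlite)

# Hypothetical, simplified stand-in for the streamed response text
sampleStream <- paste(
  '{"model":"gemma:2b","response":"The capital","done":false}',
  '{"model":"gemma:2b","response":" of France","done":false}',
  '{"model":"gemma:2b","response":" is Paris.","done":false}',
  '{"model":"gemma:2b","response":"","done":true}',
  sep = "\n")

# Split the text into one JSON string per line
chunks <- strsplit(sampleStream, "\n")[[1]]

# Parse each line into an R list
parsed <- lapply(chunks, fromJSON)

# Pull out the text fragment from every chunk
fragments <- vapply(parsed, function(x) x$response, character(1))

# Glue the fragments back together into the full answer
answer <- paste(fragments, collapse = "")
answer
# "The capital of France is Paris."
```

Each streamed line holds only part of the reply, which is why the fragments have to be pasted together before the answer is readable.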

To exit the Ollama prompt in terminal, run this command. The API will still be running, so you can still execute the programmatic R request above.

>>> /bye

To stop Ollama, run the command below in terminal. Then, in the upper toolbar, there is a llama icon. You have to click the icon and stop Ollama from running.

brew services stop ollama

If that doesn't work, try running this in terminal. It will return a number, which is the process ID.

$ pgrep ollama

To kill that process, take the number presented and use it in this command in terminal. The number 74877 is only an example; substitute your own.

kill 74877

To fully uninstall and remove Ollama, use this site, though the commands for Mac are slightly different and are cited here.
