rjp43 / cityslavegirls

The Restoration of Nell Nelson

Home Page: http://nelson.newtfire.org

HTML 97.88% JavaScript 0.17% XSLT 1.55% CSS 0.35% PHP 0.01% XQuery 0.05%
history digital-humanities xml newspapers chicago xslt html nell-nelson chicago-times css

cityslavegirls's People

Contributors

brookestewart, codykarch, ebeshero, kariwomack, nlottig94, rjp43, spadafour

cityslavegirls's Issues

What is done:

@spadafour

Article 1888-08-08 is in the worst condition. The transcription is incomplete and what is transcribed is questionable. We will keep this one out of any data visualizations until we can review and fix it. --transcription and minimal coding completed by Cody

All of the following articles have been reviewed and all of the quotation marks have been replaced either with the dialogue <said> elements or the other <hi> element.

These articles do not have the completed new grammatical markup and the <persName>'s, <orgName>'s, ``'s, `'s have not been reviewed for accuracy.

  1. Article 1888-08-09 -- transcription and coding completed by Kari (transcription complete)
  2. Article 1888-08-10 -- transcription and minimal coding completed by @brookestewart (transcription incomplete)
  3. Article 1888-08-12 -- transcription and minimal coding completed by @nlottig94 (transcription incomplete)

These articles have the completed new grammatical markup and the <persName>'s, <orgName>'s, ``'s, `'s have been reviewed for accuracy.

  1. Article 1888-07-30
  2. Article 1888-07-31
  3. Article 1888-08-01
  4. Article 1888-08-02
  5. Article 1888-08-03
  6. Article 1888-08-04

This article has the completed new grammatical markup and the <persName>'s, <orgName>'s, ``'s, `'s have been reviewed for accuracy. This article also has the versioning comparison to the Barkley text.

Article 1888-08-19

12-1 Class Meeting

@RJP43 accomplished:

  • Added <p> tags to 7/30 and 7/31 (this article is incomplete)
  • Proofread 7/30 and 7/31
  • Added <persName> <placeName> <orgName> <placeName type="address">
  • Fixed all of the <unclear> -- now consistent with TEI (for 7/30 and 7/31)

Needs to add:

  • @spadafour add schematron to web development folder and fix 8/02 and /03
  • @RJP43 adding dialogue tags for 7/30 and /31 (and finishing 7/31)
  • @CodyKarch do CSS for entire site and article 08/08
  • @KariWomack do articles 08/04 and /09

For fixing the assigned articles:

  • Use 7/30 as an example
  • Place <p> <persName> <placeName> <orgName> <placeName type="address"> tags
  • Make an attempt to clarify <unclear> tags (refer to the codebook for this)

Some notes:

  • For exposed companies: <orgName type="exposedCompany"> (Just grab the name, no pseudo-markup)
  • While you are working, add references for cities, company names, and company addresses. Do that by adding @ref, for example: <orgName type="exposedCompany" ref="#WLMC">Western Lace Manufacturing Company</orgName>

SVG pulled from Collection

@RJP43 @spadafour I've figured out what was going wrong with spacing your SVG bars, and it was a couple of things: the position() within the collection might not be a consistent calculation, so I found another attribute value (your @corresp on <title>) that yields us a reliable numerical value (after we grab it with substring-after() and the number() function). Also, to cluster your little groups of four bars, you don't want to be multiplying those by a number (on top of your multiplication of the xPos) because that spreads your bars too far apart. What you want is an addition there:
bar 1: $xPos * $Interval
bar 2: $xPos * $Interval + $barWidth
bar 3: $xPos * $Interval + (2* $barWidth)
bar 4: $xPos * $Interval + (3* $barWidth)

Does that make sense?
I don't think I'd leave these sitting side by side, but I'd at least stack the male and female bars if they're adding up to 100%. See my comments in the commit (cc109e2) and within your XSLT file.
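
The additive spacing described above can be sketched outside of XSLT. This is a minimal Python illustration, not the project's stylesheet: the function names, the sample @corresp value "#CT021", and the interval and bar-width numbers are all assumptions made up for the example.

```python
import math


def corresp_to_number(corresp: str, prefix: str) -> float:
    """Roughly mimic XPath number(substring-after($corresp, $prefix))."""
    _, found, rest = corresp.partition(prefix)
    try:
        return float(rest) if found else math.nan
    except ValueError:
        # XPath's number() yields NaN for non-numeric strings.
        return math.nan


def bar_x(x_pos: float, interval: float, bar_width: float, bar_index: int) -> float:
    """x coordinate of one bar in a cluster: multiply once, then add per bar."""
    return x_pos * interval + bar_index * bar_width


# Hypothetical values: an article numbered 21, clusters 100 units apart,
# four bars per cluster, each bar 20 units wide.
x_pos = corresp_to_number("#CT021", "#CT")
positions = [bar_x(x_pos, 100, 20, i) for i in range(4)]
print(positions)
```

Because the bar offset is added rather than multiplied, the four bars of one cluster sit next to each other, and only the cluster as a whole moves with the article number.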

The tagging saga continues.... Dashes and Quotation Marks

I am taking a closer look at a few aspects of the articles that we need to standardize, and I am tossing around a few different ways to code them here. @spadafour @ebeshero We will need to discuss how to edit these in past article transcriptions and whether to write schematron rules that fire on them in future transcriptions.

Let's first discuss dashes:

  1. Okay, so there are plenty of hyphenated words that use the regular hyphen dash − (&minus;). I was thinking we could incorporate this into our XSLT (the one that makes the reading view of the articles) so that we transform what we have been typing (the - from the keyboard) into the correct unicode (&minus;), and/or we could require that, as transcription continues, we put &minus; in place of just hitting the - on the keyboard. Either way this needs to be done to avoid future issues where the browser doesn't recognize the character. (Yes, we have seen this happen a few times already: when Matt was transforming an old article markup for CDV XSLT Exercise 2, his browser was showing an unknown character for every - typed.) My concern is that when we put the entity in the XML transcription, oXygen throws the following error: F [ISO Schematron] The entity "minus" was referenced, but not declared. As the message says, the named entity is not declared for our XML files, so we cannot just type it in the text, and I think we have to investigate the use of characters/punctuation further. Basically, we could mark the dashes in some way and then, as a part of the XSLT, grab those markers and transform them into the unicode for the HTML output. Here are a few examples (from article 1888-08-19) of when we would be using the regular hyphen (dash): four-story, devil-chaser, Paper-Box
  2. Then we have the extended dash, or em dash, which we frequently see used in the Barkley text for censored words (orgNames, streets, and persNames). The entity for this is &mdash;, which gets us this —, and again we have the same issue as before: the named entity is not accepted in our XML. Another concern is that we don't have a way to note this longer dash using the keyboard besides multiple regular dashes (---). So again we need to further investigate how to represent these in the text. Do we care to make the distinction? Would we want to put one regular hyphen, or keep the three hyphens and have the XSLT create the unicode for the dash (as discussed above) when making our HTML reading view, or do we find a way in TEI to distinguish these two different types of dashes? This is an example (from article 1888-08-19, where we incorporated the Barkley text into an <app><rdg> versioning setup) of when we would be using the em dash (long dash): <app><rdg wit="#CT021">34 to 38 East Randolph</rdg><rdg wit="#WSGC23">on R---</rdg><!--rjp: em dash???--></app> street. We also see an example of this in other parts of the articles when speech gets cut off mid-sentence, like this: Can't you make it five? She just dotes on children. If she won't take him I'll be No. 2 and run for the chance. Can't you induce him to call here? We are tailoresses here, but when we appear upon the street we are --- <!--rjp: em dash???-->
  3. So there is one other type of dash that we might consider if we see it pop up (which I have not yet, but if we come up with a TEI system we might consider including it in case it comes up) and that is the en dash (–) &ndash;
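
One way to test the dash question outside of oXygen is sketched below in Python (standard library only). It rests on two facts about XML itself: only five named entities are predeclared (&amp;, &lt;, &gt;, &quot;, &apos;), and numeric character references such as &#x2014; are always allowed. The function names and the rule "three or more keyboard hyphens become one em dash" are assumptions for illustration, not a decision the project has made.

```python
import re
import xml.etree.ElementTree as ET


def normalize_dashes(text: str) -> str:
    """Replace a run of three or more keyboard hyphens with one em dash (U+2014)."""
    return re.sub(r"-{3,}", "\u2014", text)


def parses(xml_text: str) -> bool:
    """Report whether a plain XML parser (no DTD) accepts the fragment."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# An undeclared named entity is rejected; a numeric reference is accepted.
named_ok = parses("<p>on R&mdash; street</p>")
numeric_ok = parses("<p>on R&#x2014; street</p>")

print(named_ok, numeric_ok)
print(normalize_dashes("on R--- street"))   # keyboard placeholder replaced
print(normalize_dashes("four-story"))       # single hyphens are left alone
```

If the transcriptions keep typing --- as a placeholder, a rule like normalize_dashes could run at transformation time, so the XML stays easy to type while the reading view still gets the proper character.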

Okay, my next concern is the formatting of our quotation marks. There are several uses of quotation marks in different contexts in this project that we might mark in different ways. I want to find a way to standardize our representation of these so that we can use the curly quotes (single and double) and apostrophes appropriately and according to David's (@djbpitt) project suggestions from last semester.

  1. My biggest concern is the use of quotation marks around dialogue. For this I was thinking we should use the TEI element <q> in place of the pseudo-markup (quotation marks). In fact, we could go through and, instead of having the <said> elements, just use <q> elements and put the attributes previously on <said> on the <q>. I have tested this and <q> does accept those same attributes. This would mean changing the <said> tags in past articles and editing our SVGs and XSLTs accordingly (though it seems we will be editing these a lot either way). Alternatively, it would be chunky markup, but we could keep the <said> with all of its attributes and have the <q> element sitting inside, just replacing the pseudo-markup (quotation marks). There might be a TEI reason against this, so we would want to verify we aren't breaking TEI rules ( @ebeshero ) if we decide to do it like that.
    This is what we currently have: <said who="#employee" ana="male">"All but him binds packages; he glues."</said>
    Options to change to:
    <said who="#employee" ana="male"><q>All but him binds packages; he glues.</q></said> OR
    <q who="#employee" ana="male">All but him binds packages; he glues.</q> OR
    We make a @rend / @rendition attribute on either the <said> or the <q> that points to the use of quotes; we would need to figure that out more using this section of the TEI
  2. Another instance where quotes are used is for specific phrases or emphasized words. For example, in this paragraph we see two points where words are set out in quotes.
    Chicago, Aug. 13. - TO THE EDITOR: One who reads your articles with more than passing interest, and who deeply sympathizes with the cause of honest labor, has sufficient romance in his "make-up" to perform his part in assisting the young lady of brains referred to, and if honesty of purpose, good bringing up, etc., accompany the brains, the lady can find at the head of an honest, temperate, working-man's home a peace and comfort not found in "wearing out her young life" in pursuit of a mere existence.
    And we see this frequently, not just in cases where the person writing to the editor could be quoting a past article's wording. For example: Nothing short of a Philadelphia lawyer, a Chicago health officer, a proprietor or a "devil-chaser" that hits the spot once in a thousand times could, without a guide, explore the labyrinth that is known as H. Schultz &amp; Co.'s paper-box manufactory...
    Because we see these multiple uses, I have been researching two options in the TEI: <emph> versus <hi>. I think we could get away with just using one or the other, and since we cannot be sure the intention of the quotes is emphasis, <hi> seems more logical. I would like @ebeshero's input on this though. @spadafour you can read more about the difference between the two here to weigh in as well. It seems either choice uses the @rend / @rendition attribute to declare how the emphasis or highlighting is marked, and we may need a clarification on the difference between the two (@rend vs. @rendition).
  3. Sometimes we get a similar marking of specific words in single quotes instead of the double quotes and we should probably figure out a way to separate those out as well so they can be styled and transformed accordingly. Here is an example and this one in particular sits inside of a set of dialogue quotes which I have replaced with the q markup discussed above: <p><q>Are not the 'white slave' articles in THE TIMES somewhat sensational?</q></p>
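
To get a feel for the first option above (keeping <said> and placing a <q> inside it), here is a small Python sketch using only the standard library. It is an illustration under assumptions, not a proposed project script: it ignores the TEI namespace, only touches <said> elements whose whole content is one straight-quoted string, and the function name is made up.

```python
import xml.etree.ElementTree as ET


def wrap_quotes_in_q(said: ET.Element) -> ET.Element:
    """Replace surrounding straight quotes in a <said> with a child <q>."""
    text = said.text or ""
    has_children = len(said) > 0
    if not has_children and len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        q = ET.SubElement(said, "q")
        q.text = text[1:-1]   # drop the pseudo-markup quotation marks
        said.text = None
    return said


sample = '<said who="#employee" ana="male">"All but him binds packages; he glues."</said>'
converted = ET.tostring(wrap_quotes_in_q(ET.fromstring(sample)), encoding="unicode")
print(converted)
```

Dialogue that already contains child elements (for example an interruption marked with <rs>) is left untouched here and would still need hand review.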

Finished Website... EXCEPT...

Hi everyone ( @spadafour @KariWomack @CodyKarch @ghbondar @ebeshero ),

Alright so the site is finally finished.... EXCEPT........

there are a few problems I have run into since transferring the site to the web that I'm confused about, and I'm looking for help. Everything is synced!

Okay, first thing when I open the index page locally it looks like how I want it to, like this:

[screenshot: capture]

but when I open it on the webpage the second image doesn't appear (except for on my phone... curious I know!)

Okay, second thing: the SSIs aren't working, but only on the article HTML pages (these pages were generated using this XSLT, if that helps), and this is the error I am getting:

[screenshot: ssiissue]

it starts at the top of the page and the second line finishes after the article text. And yes I opened the HTMLs and checked if they were valid in oXygen.

I'm tagging @amielnicki and @nlottig94 as well just in case either of them get a chance to look at this and can help

Where are we going?

@spadafour

Let's touch base here...

Where are we in getting the Barkley Publication updated with <p> tags?
What semantic markup have we decided to remove from the articles? And what did we decide to add? (ARE WE DOING VERBS? We want to remove the adjective markup except on archetypes and use of possessive pronouns, right?)
Once the Barkley Publications are complete, I think we could do a bulk transcription of the articles by copying over the Barkley text and running through it to edit the differences. Do we want to do this? Can we think of a clever way to mark up what is added/subtracted? Is that useful markup for what we plan on creating visually?
What do we plan on creating visually?
Let's come up with a solid new research question... and I think we agreed it would be based on versioning?

Transcription and Encoding

The tasks in this issue (upon completion) will count as credit for Project Development for weeks ending 10/30, 11/6, and 11/13.

I would like each of you @spadafour @rCarls @CodyKarch @KariWomack to read one article from the PDF images of the original articles (published in the Chicago Times) and try your hand at transcribing it.

Go here on GitHub and download the TEI header and Structure Template by viewing the "raw" text, right-clicking and downloading/saving the file.
Then, in oXygen, open that file.
Review the TEI Header and edit the portions of the header (<date> inside of <title>... etc.) that describe the individual article you are working with and the <resp> element to specify that you are the one transcribing and coding the article.
Once you have reviewed and edited the TEI header go down to the second <div> that is marked @type="headlines" and transcribe each of the headlines into an <item> element inside of the <list> element.
The main text of the article will be transcribed in the third <div> that is marked @type="articleBody". Be sure to separate the text into <p> elements every time there is a paragraph break. However, do not be concerned with the page breaks because the poor quality of the images makes it difficult to tell when the pages are actually separated.
Note: if there is no advertisement following the article (that references the series) delete or comment out the final <div> element that is marked @type="advertisement".

As you are transcribing if you run into an area where the text is unclear and you can only infer what the text says or if the text is completely unreadable refer to our codeBook on how to properly represent the unclear text.

Refer to this article and our conversation in Issue #6 while encoding to see the kinds of elements, attributes, and attribute values we are using to encode the articles. If you decide there is something you want to encode that we have not already determined a TEI element for feel free to find an appropriate element in the TEI guidelines. If you do this please leave a comment directly after the new element (inside of your XML) and ping the group in this issue with the element and your reasoning behind choosing it.

Mark with comments (that include your name and date) areas of interest and parts of the article that stand out to you or that you have difficulty transcribing/encoding.

When you are finished, push your transcription to the ChicagoTimes_CSG_XML Folder using your desktop client.

Once you have completed your transcription and basic encoding (not an easy task so give yourself sufficient time) go into the Anon_WSG Folder and hunt for the corresponding Barkley Section to the article you just transcribed. The best way to do this is by skimming the headlines of the sections in search of headlines that match those from your article. Read the section or sections (once you find whichever matches your article) and jot down any noticeable differences between the text and headlines of the Barkley section and the article you just encoded/transcribed.

In the future, we will also want to do a similar review of the McEnnis chapters to see where in that text your article is referenced and how it differs from the original article. --- We can't do this just yet because we still need to go through the McEnnis text and transfer it into TEI XML.

Upon completion of these tasks comment in Issue #9 giving us which section from the Barkley text a.k.a. Anon_WSG folder corresponds with the article you transcribed and what the noticeable differences are between the texts/headlines.

This is significant project work: transcribing, beginning basic TEI structure tagging, and starting the first stages of version comparison. This will also give each of you a chance to become better acquainted with the Nelson project and begin finding interesting things you may want to produce data visualizations on for upcoming assignments.

Article Assignments:
@spadafour --- 8/6/1888
@rCarls --- 8/7/1888
@CodyKarch --- 8/8/1888
@KariWomack --- 8/9/1888

(It may be easier to view the PDF images of the files through our source site found on the NYU Digital Library. Simply find your article in the list of articles, click on it, and then click to view the PDF file.)

Please contact me via email or in this issue with any questions and concerns as they occur.
Thank you!

Consistent Coding

@KariWomack @CodyKarch @spadafour @ebeshero

I am going to make a list here of the elements, attributes and values we should be using and the descriptions associated so that we are making sure to code consistently. @spadafour This will also, hopefully, be useful in the writing of the schematron. Everyone please refer to Issue #15 for the xml:ids of the companies so that we also keep those consistent between articles and be sure to add to that issue when you encounter companies you have to make an xml:id for if it isn't listed there already.

the structural elements are <head>, <div>, and <p>

  • the <head> element contains the element <title> only
  • the <div> element has the attribute @type which is equal to one of the following values "headlines", "articleBody", or "advertisement"
    * inside of the <div type="headlines"> there is the additional structural markup of <list> and <item>, with each <item> holding an individual headline sentence. There are no <p> elements inside of this <div>.
  • the other two div types should have <p> elements separating the paragraphs as they are represented in the original texts.

contextual markup:
the <date> element should always have the @when attribute, and its value should always follow the year-month-day format, for example <date when="1888-07-31">

the <orgName> element just needs the @ref attribute, with the value corresponding to the specific organization (the list of these xml:ids is in Issue #15), when it is used for any organization that is not a company being discussed by Nelson or a working girl.
When it is such a company, add the attribute type="exposedCompany" in addition to the @ref attribute with the value corresponding to the specific company (the list of these xml:ids is in Issue #15).

the <persName> element is going to surround any person's name, whether it is just a first name, nickname, last name, or full name reference. Although we are not using this element specifically this semester for data analysis, this is information that might be needed in the future.

the <placeName> element can be left without any attribute when used for the name of a place generally, for example around the name of a city or country. Otherwise we add the attribute @type with the value "address", and when it refers to the exact address of a company we also add the @ref attribute with the value corresponding to the specific company (the list of these xml:ids is in Issue #15).

the <said> element gets the attributes @who and @ana

  • @who can equal any of these values: "unidentified", "workingGirl", "nellNelson", "foreperson", "employer" (the owner of the company only; be careful not to confuse this with a foreman, as there is a distinction), "employee" (any employee who is not at the same level of labor as the working girls, for example a clerk or secretary), "benefactor" (used in one specific article where a man gives Nell carfare but does so with ill intentions; this is the name Nelson uses for that man, so we can hold onto it), and finally "messenger" (any person who is speaking on behalf of someone else; we frequently see this with relations of working girls, like a mother or son coming into the company to inquire about work or inform of illness, and Nelson documents the dialogue between the messenger and the employer or foreperson)
  • @ana can have the following associated values "unknown", "female", or "male"

the <rs> element can have the attributes @type, @subtype, and @resp

  • @resp includes the # and the xml:id of the interpreter
  • @type can have the following values: "interruption" (used as an indicator when Nelson interrupts dialogue with journalistic additives, for example he then told me or she cried appearing in between quoted material), "wageDesc" (any conversation discussing wages), "livingCond" (any conversation discussing a person's living conditions), "workEnvir" (any conversation discussing the physical environment being worked in), "workDesc" (any conversation describing the work being done), "wgDesc" (description of an individual working girl or a group of working girls; for example, the following two sentences would get that @type: "Her face was sad and so very, very pale that I shall never look at a jersey again without seeing her face." and "The average age may have been 23, but not less. There were girls of 17 and 18 and some world-weary women past 50 all working for little more than enough to keep body and soul together."), and "personDesc" (description of an individual who is not a working girl).
  • @subtype can have the following values "positive", "negative", or "mixed" (explained in Issue #16)
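
Until the schematron is written, a few of the rules above can be spot-checked with a short script. The following Python sketch (standard library only) covers just three of them: the <div> @type values, the @when format on <date>, and the @who / @ana values on <said>. The function name and the sample fragment are invented for illustration, and stripping a leading # from @who is an assumption based on examples such as who="#employee" elsewhere in these issues.

```python
import re
import xml.etree.ElementTree as ET

DIV_TYPES = {"headlines", "articleBody", "advertisement"}
WHO_VALUES = {"unidentified", "workingGirl", "nellNelson", "foreperson",
              "employer", "employee", "benefactor", "messenger"}
ANA_VALUES = {"unknown", "female", "male"}
WHEN_FORMAT = re.compile(r"\d{4}-\d{2}-\d{2}")


def local_name(tag: str) -> str:
    """Drop a namespace prefix such as {http://www.tei-c.org/ns/1.0}."""
    return tag.rsplit("}", 1)[-1]


def check_markup(xml_text: str) -> list:
    """Return a list of human-readable problems found in the fragment."""
    problems = []
    for el in ET.fromstring(xml_text).iter():
        name = local_name(el.tag)
        if name == "div":
            div_type = el.get("type")
            if div_type not in DIV_TYPES:
                problems.append(f"div: unexpected @type {div_type!r}")
        elif name == "date":
            when = el.get("when", "")
            if not WHEN_FORMAT.fullmatch(when):
                problems.append(f"date: @when {when!r} is not year-month-day")
        elif name == "said":
            who = el.get("who", "").lstrip("#")
            ana = el.get("ana")
            if who not in WHO_VALUES:
                problems.append(f"said: unexpected @who {who!r}")
            if ana not in ANA_VALUES:
                problems.append(f"said: unexpected @ana {ana!r}")
    return problems


# Made-up fragment with two deliberate mistakes (a misspelled div type
# and a @when value that is not in year-month-day form).
SAMPLE = """<body>
  <div type="articeBody">
    <p><date when="July 31, 1888">July 31</date>
    <said who="#employee" ana="male">All but him binds packages; he glues.</said></p>
  </div>
</body>"""

for problem in check_markup(SAMPLE):
    print(problem)
```

The same value lists could later be carried over into schematron rules so that oXygen flags these problems while transcribing.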

HOPE THIS HELPS!

xml:ids for Companies to correlate Companies and Addresses

Here are a few examples of how your <orgName> and <placeName type="address"> should be coded using the @ref attribute. The @ref attribute will correspond to an xml:id in our Site Index. In this issue please list any companies that you identify in your articles with the corresponding xml:id you chose for the company, and refer to this issue before assigning new xml:ids to be sure no one else has already come across the company and given it an xml:id. Thanks everyone @KariWomack @CodyKarch @spadafour !!

This is how the code looks with text sitting around the coded information ... this is an example of how we code company names as well as the corresponding addresses:
The fifth place on my list was the "<orgName ref="#WLMC" type="exposedCompany">Western Lace Manufacturing Co.</orgName>" <placeName ref="#WLMC" type="address">218 State street</placeName>

The following do not have the surrounding text so as to simplify the examples:

This is an example of how you put a ref id on an orgName that isn't specified as a company:
<orgName ref="#WPA">Woman's Protective agency</orgName>

This is an example of how you put a ref id on an orgName that is specified as a company but sits separately from the given address:
<orgName ref="#RCRB" type="exposedCompany">Rosenthal &amp; Co.'s and Rosenberg Bros</orgName>

This is an example of how you put a ref id on an orgName that is specified as a company and sits with a partial address:
<orgName ref="#Stein" type="exposedCompany">Stein's</orgName>, on <placeName ref="#Stein" type="location">Market street</placeName>
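
Because the company name and its address share one @ref value, the two can be lined up mechanically. The Python sketch below (standard library only) shows the idea on two of the examples above; wrapping them in a <div> with <p> elements is an assumption made so the fragment parses, and the function name is hypothetical. The TEI namespace is ignored for brevity.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

SAMPLE = """<div>
  <p>The fifth place on my list was the <orgName ref="#WLMC" type="exposedCompany">Western Lace Manufacturing Co.</orgName> <placeName ref="#WLMC" type="address">218 State street</placeName></p>
  <p><orgName ref="#Stein" type="exposedCompany">Stein's</orgName>, on <placeName ref="#Stein" type="location">Market street</placeName></p>
</div>"""


def correlate(xml_text: str) -> dict:
    """Group company names and places by their shared @ref value."""
    index = defaultdict(lambda: {"names": [], "places": []})
    for el in ET.fromstring(xml_text).iter():
        ref = el.get("ref")
        if ref is None:
            continue
        text = "".join(el.itertext()).strip()
        if el.tag == "orgName":
            index[ref]["names"].append(text)
        elif el.tag == "placeName":
            index[ref]["places"].append(text)
    return dict(index)


index = correlate(SAMPLE)
for ref, entry in index.items():
    print(ref, entry["names"], entry["places"])
```

A listing like this also makes it easy to spot a @ref that has a name but no address yet, or two different spellings filed under the same xml:id.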

Meet-Up Times!

Hey, just so that we get some constant work done for the Nelson project, me and @KariWomack are going to meet in the classroom Wednesday (11/4/15) and Friday (11/6/15) at 6pm to 8pm this week!!! If anybody would like to join us to accomplish stuff, @rCarls @spadafour @RJP43 , we will be there! :D

Nell Nelson Biography

One important aspect of the Nell Nelson project is to restore the biographical information of the woman behind the pseudonym. Nell Nelson's real name was Helen Cusack (after marriage Helen Cusack-Carvalho). As members of the restoration project we need to pull our resources together to find source documents and really ANY information available on Cusack as well as her family.

What we know:

  • Maiden name: Helen Cusack (sometimes misspelled Cusach)
  • Husband: Solomon Solis Carvalho
  • Two daughters: Sarah Virginia Creshore and Helen Steele
  • Raised her family in New Jersey, but she was born in Missouri. Solomon and Helen met in New York while both working on the New York World newspaper.

What we want to be looking for:

  • death records
  • census records
  • birth records
  • personal correspondence
  • other articles written by Cusack (likely under Nell Nelson)
  • any documents regarding Nelson or Cusack

Where to look:

  • If any of you have memberships to family tree sites (e.g. geni or ancestry.com), search and pull information from documents available through there
  • local Government archives
  • historical societies

If you find any information, please upload it to this GitHub in the ResearchDevelopment folder and comment here with questions or to alert us of your findings. Happy hunting! :)

@spadafour
@rCarls
@KariWomack
@CodyKarch

Archetype Network Graphs

@nlottig94

  • Create legend on each of these archetype network graphs
    (City Slave Girl, Foreperson, Employer, Employee, Nelson)
    Please reference our bar graphs for legend styling.

Legends should include:
people/relationships -- node color #66FF66
body/self references -- node color #FF6666

  • Eliminate excess whitespace to the right of the networks

  • If you have time, add legend and eliminate excess whitespace on the other archetype network graphs (Messenger, Benefactor)

  • If you have time, eliminate excess whitespace on the other article specific grammar network graphs

  • If you have time, create legend for existing article specific grammar networks with my assistance

The tagging saga continues ... Version Control

@ebeshero @spadafour When we are marking up the versioning between texts, I am trying to figure out whether it makes more sense to have elements like <placeName>, <persName>, and <orgName> surrounding the <app><rdg> setup or to have those elements sitting inside each of the different <rdg> elements.

Easier to explain with an example:
Here is the text as it originally appears in the articles:
the labyrinth that is known as H. Schultz &amp; Co.'s paper-box manufactory, 34 to 38 East Randolph street.

Here is the text as it originally appears in the Barkley text:
the labyrinth that is known as H.S. &amp; Co.'s paper-box manufactory, on R--- street.

Option 1
the labyrinth that is known as <orgName ref="#HSC" type="exposedCompany"><app><rdg wit="#CT021">H. Schultz</rdg><rdg wit="#WSGC23">H.S.</rdg></app> &amp; Co.'s paper-box manufactory</orgName>, <placeName ref="#HSC" type="address"><app><rdg wit="#CT021">34 to 38 East Randolph</rdg><rdg wit="#WSGC23">on R---</rdg><!--rjp: em dash???--></app> street</placeName>
I like this way because we can see exactly what parts are kept versus changed and there isn't the repetition of elements so we can still do a count of all the companies by <orgName> and it won't be distorted.

Option 2
the labyrinth that is known as <app><rdg wit="#CT021"><orgName ref="#HSC" type="exposedCompany">H. Schultz &amp; Co.'s paper-box manufactory</orgName></rdg><rdg wit="#WSGC23"><orgName ref="#HSC" type="exposedCompany">H.S. &amp; Co.'s paper-box manufactory</orgName></rdg></app>, <app><rdg wit="#CT021"><placeName ref="#HSC" type="address">34 to 38 East Randolph street</placeName></rdg><rdg wit="#WSGC23">on <placeName ref="#HSC" type="address">R--- street</placeName></rdg></app>

I am leaning towards Option 1, but I wanted opinions?

I am tagging the Dickinson team here as well because they are used to this versioning setup. @nlottig94 @brookestewart
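
As a rough aid for drafting the <app><rdg> pieces, the two witnesses can be compared word by word and only the differing stretches wrapped, which gives the same granularity as Option 1. This Python sketch uses difflib from the standard library; the function name is made up, whitespace tokenization is a simplifying assumption, and the <orgName> / <placeName> wrappers would still be added by hand.

```python
import difflib
from xml.sax.saxutils import escape


def collate(text_a: str, text_b: str, wit_a: str, wit_b: str) -> str:
    """Wrap word-level differences between two witnesses in <app><rdg>."""
    a, b = text_a.split(), text_b.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(escape(" ".join(a[i1:i2])))
        else:
            # replace, delete, and insert all become one <app> with two readings
            rdg_a = escape(" ".join(a[i1:i2]))
            rdg_b = escape(" ".join(b[j1:j2]))
            pieces.append(
                f'<app><rdg wit="{wit_a}">{rdg_a}</rdg>'
                f'<rdg wit="{wit_b}">{rdg_b}</rdg></app>'
            )
    return " ".join(pieces)


chicago_times = ("the labyrinth that is known as H. Schultz & Co.'s "
                 "paper-box manufactory, 34 to 38 East Randolph street.")
barkley = ("the labyrinth that is known as H.S. & Co.'s "
           "paper-box manufactory, on R--- street.")

collated = collate(chicago_times, barkley, "#CT021", "#WSGC23")
print(collated)
```

This only proposes where the readings diverge; a person still has to confirm each <app> and decide how the name elements wrap around it.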

Official Submission Spring 2016

@ebeshero @ghbondar
@RJP43 and I are officially submitting our project for grading. This is what we were able to produce in the time allotted; there are aspects of this site that we know need to be improved. Since this project is ongoing, these are the things we plan to continue with in the future:

-We could not implement PHP within the reading views in time
-We did not get javascript working on the single article ready for versioning, 8/19
-We have yet to highlight our grammar markup using javascript that links the network analysis to readable text
-We have yet to map locations

What we have done:
-Complete overhaul of the site
-Complete codebook
-Complete XML on all Chicago Times articles
-CSS and layout overhaul
-Multiple reading Views
-Grammatical Network Data Visualization

What needs to be done:

@spadafour Here is our big checklist:

Network Analysis (grammar)

  • fix data error for grammar tsv needed for cytoscape
  • output readable cytoscape directed network analysis
  • try implementing PHP on SVG of the cytoscape network analysis

Mapping

  • finish testing mapping programs that overlay maps
  • use photoshop and image mapping to mark out Nelson's route per article
  • try implementing PHP on image mapping to toggle each route based on data pulled from server on each article (linking article and route, but map image itself never changes)

XML to HTML

  • use PHP to keep main HTML on page static and only change inner div where text appears based on the XSLT toggled and article selected
  • create XSLT that creates versioned text in some kind of comparing manner
  • create XSLT that highlights gender markup (toggling gender dialogue)
  • create the basic XSLT that gives plain text ??
  • create XSLT that highlights grammatical markup (maybe with tables at bottom/side of article text that gives the parts (words) of each phr that is in the seg containing it all)
  • fix faulty links on About and other current HTML pages
  • fix CSS on current SVG page

Other

  • update Fall 2015 SVG to include new articles and archetypes
  • make Fall 2015 suggested revisions
  • update Bio info for editors and figure out better formatting ??
  • placeName network analysis ??

Network Analysis and Data Viz

@RJP43 @spadafour @nlottig94 Here are some resources to get you started with network analysis:

  1. My blog:
    https://digitalromanticist.wordpress.com/2013/08/23/spectacular-intersections-of-place-in-southeys-thalaba-the-destroyer/

  2. My XML to Cytoscape tutorial: http://ebeshero.github.io/thalaba/cytosc.html

  3. Obdurodon projects using network analysis:

  4. Not network analysis, but worth looking under the hood at the code for article interface and other visualizations: http://presidential.obdurodon.org/index.xhtml

12/17 Project work

@RJP43 @CodyKarch @KariWomack
Hey guys! Just letting you know that I'm up at Press Room 122 in Village Hall working. I know you guys won't be ready to work until later, but just wanted to give you an update.

Correlate Newspaper Files with Book Files

You need a convenient way to tell which newspaper articles go with which portions of the two book publications, so you can line them up together. I'd suggest a Wiki Page first to hold this information for the team.

Meeting Times

I am available MWF from 11:30-1, and after 3:00, or T TH, until 2:30, from 4-5:15, or after 7.

Help with XSLT -- Suggestions Please!!

@ebeshero @spadafour @nlottig94 @brookestewart

So the final reading view for the XSLTs that I need to finish is the one that highlights our grammatical markup and I am struggling to come up with a logical setup. So previously when we were highlighting our old grammatical markup we had a table running to the left of the article text basically listing all of the nouns and their corresponding adjectives. See here for how we were doing that before! We can do this again, but I wanted to see if anyone had any better ideas???

Also, I wanted to see about outputting the SVG of each article's network graph before the article text (and possible tables), so that when a site user chooses to view an article with the grammar XSLT, the network analysis for that specific article appears first, and scrolling down brings up the article text and the tables, if we decide to keep those. What I am struggling with, however, is grabbing just what is necessary from the SVG file (the group of them can be seen here) and calling on it in my XSLT. Since we are using PHP, the XSLT has to be able to call on any given SVG generically, according to the matching date that I already have the XSLT pulling into the <title> element of the HTML. I thought that having the date in each network graph's filename would make the matching easier. I'm not sure if it's even possible, but I think that would be cool. If we can't grab a piece of the SVG file, the other option is to output all of the graphs on one page under analysis, with an id on the div surrounding each, and give the user a link that jumps to that article's network on the network analysis page; but I think having it all on one page would be a lot cooler! Maybe XQuery is the better place for that, but then we would have to figure out how to get the rest of what I already have in this XSLT working in XQuery.

Any feedback and suggestions are welcome!!!!

Network Analysis Discussion: Bimodal and Directed

@RJP43 and @spadafour Rob and I were discussing your ideas for plotting a network analysis from CitySlaveGirls. First of all, just to get your feet wet:

  1. You might just want to try this as a prototype for this homework from a bundle of files you know is in good shape (as in, just isolate the files that have the data you want).
  2. It seems pretty clear that what you are creating is a directed network, rather than the undirected kind of network that I'm describing in my assignment. That is because (as I understand it) you're wanting to plot:
    Source-Node: An agent or person (from the "archetypes" via the pronouns/possessive nouns)
    Shared-Interaction: the active grammatical relationships (from the "segs" markup)
    Target-Node: the object (whether a material object or even a person or group of people being objectified!)
  • So this is a directed network because the agent is exerting power (if you will) over an object or defining a relationship to it. The directional flow goes from Agent to Object. When you go to plot your graph, you'll want to select "directed" network (not undirected as I'm leading people in the homework assignment).
  • And this is a bimodal network, which means that your Source and Target nodes are of two different kinds. Your source and target nodes are going to lose their distinctiveness in Cytoscape when it converts your raw network data into a graph, so you want to output attribute columns (a source-node attribute and a target-node attribute): Those attributes should contain a piece of text that helps you flag when a particular node is an Agent or an Object.
  • When you output this in Cytoscape you'll be able to refer to those attribute columns in order to control things like the shape or color of your output nodes. For a bimodal network, you might want to make your shapes be different: say, to link circles to triangles or something to help distinguish Agents from Objects.
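The attribute-column idea above can be sketched as a small script that writes a tab-separated edge table for import into Cytoscape. This is only an illustration under stated assumptions: the column names, the "possesses" interaction label, and the sample record (borrowed from the "my list" markup example) are hypothetical, not the project's actual output.

```javascript
// Hypothetical sample data: one record per possessive relationship.
// "agent" is the archetype doing the possessing, "object" is the thing possessed.
const sampleEdges = [
  { agent: "nellNelson", interaction: "possesses", object: "list" },
];

// Build a tab-separated edge table with one attribute column per node,
// so that Agents and Objects can be styled differently after import.
function toEdgeTable(edges) {
  const header = ["source", "interaction", "target", "sourceType", "targetType"];
  const rows = edges.map((e) => [e.agent, e.interaction, e.object, "Agent", "Object"]);
  return [header, ...rows].map((cols) => cols.join("\t")).join("\n");
}

console.log(toEdgeTable(sampleEdges));
```

Because the direction is carried by which column a node sits in (source versus target), the same table serves for a directed import, and the two type columns are what a style mapping for shape or color would refer to.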

Hope that makes sense, and I'm eager to see how your network experiment comes out!

Moving into TEI

So, as we have discussed, the Nell Nelson project currently uses a customized RelaxNG schema of approx. 40 different elements and attributes, and we need to move forward with converting these tags into TEI tags.

Three Steps to Convert to TEI:

  1. Choosing TEI elements/attributes that correspond well with the tags we already have in place
  2. Choosing other aspects of the project we might be interested in coding and choosing TEI tags that will best represent this new research
  3. Using regular expressions to find the old tags and then going forward with replacing them with the new TEI tags.
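As a rough sketch of step 3, a find-and-replace over tag names could look like the following. The old tag and attribute names here ("voice", "speaker") are placeholders invented for illustration, not the project's actual schema; only the target <said> element with @who comes from our discussion.

```javascript
// Hypothetical mapping from an old project tag to a TEI tag.
// "voice" and "speaker" are placeholder names, not the real old schema.
const tagMap = [
  { oldTag: "voice", newTag: "said", attrs: { speaker: "who" } },
];

function convertTags(xml, mappings) {
  let result = xml;
  for (const { oldTag, newTag, attrs } of mappings) {
    // Rewrite opening tags, renaming any mapped attributes along the way.
    const open = new RegExp(`<${oldTag}(\\s[^>]*)?>`, "g");
    result = result.replace(open, (match, attrText = "") => {
      let renamed = attrText;
      for (const [oldAttr, newAttr] of Object.entries(attrs)) {
        renamed = renamed.replace(new RegExp(`(\\s)${oldAttr}=`, "g"), `$1${newAttr}=`);
      }
      return `<${newTag}${renamed}>`;
    });
    // Rewrite the matching closing tags.
    result = result.replace(new RegExp(`</${oldTag}>`, "g"), `</${newTag}>`);
  }
  return result;
}

console.log(convertTags('<voice speaker="nellNelson">Hi</voice>', tagMap));
```

The same patterns can be typed into oXygen's find/replace dialog one at a time; the script form is just a way to keep the whole mapping in one place. Regular expressions do not understand XML nesting, so the converted files should still be validated against the new schema afterwards.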

These are the main components of our files that we need to make sure we represent with our new TEI tags:

  • structure --- newspaper versus book (ie. newspaper - headlines, subtitles, date information and books - chapters, chapter titles, paragraphs)
  • versioning --- we want to find a way to link all three sources together by pointing out what is different and this is called versioning (we need to figure out the best way to do this so that if something is mentioned in the original articles and not mentioned in the book sources we can notice the variation to see if there are trends in what is excluded -- trust me there is!)
  • conversation or dialogue --- an interesting aspect that previous editor Shane Daube pursued was the varying connotations between people speaking with one another. We need to find a way to convert these voice tags.
  • content --- descriptions of working conditions, mentions of unionization/labor reform, descriptions of living conditions, etc.
  • references --- to people, places (locations), companies

To begin:

  1. Each of us needs to read through the couple of articles and chapters already marked up and take note of elements in place and aspects of the text that need developed by adding new elements
  2. Go into the TEI guidelines and find sensible elements and attributes to convert to
  3. Begin discussion in this issue with TEI elements/attributes that make sense that we can agree on

After we have decided on our tags I will write up a codebook and we can begin converting the few documents we have marked up into TEI and possibly begin the markup of some of the other transcribed and need-to-be transcribed files.

@KariWomack
@spadafour
@CodyKarch
@rCarls

Rules to Constrain with Schematron:

<said> - and corresponding @who with the values of "unidentified", "workingGirl", "nellNelson", "foreperson", "employer", "employee", "benefactor", "messenger", and @ana with the values "male", "female", "unknown"

... add more here with questions ...

The tagging saga continues ... Grammatical Markup

  • Remove markup of adjectives linked with nouns: EX. <seg><w type="adj" ana="crochet">crochet</w>-<w type="noun">teacher</w></seg>
  • Remove markup of adjectives linked with other adjectives: EX. <seg><w type="adj" ana="2">two</w> <w type="adj" ana="feather">feather</w> <w type="noun">factories</w></seg>
  • Remove floating noun markup (nouns without associations): EX. <w type="noun">
  • Remove floating adjective markup (adjectives without associations): Ex. <w type="adj" ana="impenetrable">impenetrable</w>
  • Keep the markup that points to possessive pronouns and the linked nouns (what is being possessed): Ex. <seg><w type="adj" subtype="poss" ana="#nellNelson">my</w> <w type="noun">list</w></seg>
  • Review all articles for possessive pronouns and possessive adjectives and the related nouns
  • Mark related possessed nouns with archetype references where necessary
  • Edit schematron rules accordingly so that only rules regarding possessive adj and possessive pronouns are allowed in the <seg>
  • Edit codebook accordingly

UNRH Conference

Huzzah! The Restoration of Nell Nelson project has been accepted to present at the UNRH conference (to view details about the conference please visit http://unrh.org/)

There is opportunity for me to take fellow team members to this conference if anyone else is interested. I have to respond to the approval invitation by October 5th; therefore, it is very important that if you are interested in attending this conference that you comment on this issue!

We will be travelling and arriving on Thursday, November 5th and the conference begins early morning November 6th and runs through Sunday morning (approx. 11AM), November 8th. We will travel back on Sunday and be home for Monday classes.

So please, if you are interested, this is the time and place to let me know :)

Editor Biographies

In addition to completing the tasks of transcription and version linking I posted about in Issue #7 and Issue #9, I would like all of you to create a brief biography to include on the site.
This is what I currently have on the site, and I would like each of you to have something a bit more than what is here. Feel free to include links. To edit your bio, first sync our Nelson repository in your Desktop Client. Then, in the file explorer, find your way to the website folder. The biographies are on the about.xhtml page that sits directly inside of that folder. Simply open that file in oXygen, edit your bio at the bottom of the page, and save. Don't forget to commit your changes and sync again in your Desktop Client so that the file is pushed to GitHub for all of us to see. Ping me here with any issues and once you have completed this task!! Thanks!!

Credit for Progress

Hey @RJP43 , I was wondering what our tasks for the weeks of 11/20, and 11/27 would be? I believe you would like us to keep working on articles, but I don't know what your say is for the credit in these weeks. Let me know, thanks!
@rCarls @KariWomack @spadafour

Notes for Revision on Nell Nelson Website

Here are some suggestions to supplement the project review I sent last night:

  • Add a separate Methodology page to feature blocks of code with explanations, plus links to the software / technology you're using.
  • The site could use some explanation and/or representation of your Site Index. On your explanation of the network analysis, you say in passing that "orange-border nodes are site-index nouns or archetypes that are indirectly possessed", but that language, even using "site-index" as an adjective here, presumes that your site visitors are familiar with your site index. As the site is currently configured, your visitors can only know of the site index by reading your Codebook here on GitHub (offsite). That's a problem.
  • Consider an HTML rendering of your site index as a new page!
    @RJP43 @spadafour

Editor Biographies and Site Index Ids

Biography:
Nicole Lottig is a 2015 graduate from the University of Pittsburgh at Greensburg, majoring in Cultural Anthropology with a minor in Gender Studies. She is currently the Technical Assistant for Pitt-Greensburg's new Center for the Digital Text, as well as an intern for the Academic Programs International Campus Advocate program. Nicole plans to attend graduate school in the future.

Site Index: #nll

Data for *Possible* Mapping

@ghbondar

Here are the links to the two files that contain all of our place references within 1888-Chicago.

This one gives you companies and their addresses : http://dxcvm05.psc.edu:8080/exist/rest/db/Nelson/NellCompaniesAddresses.tsv

This one is just the places that we call "local references" :
http://dxcvm05.psc.edu:8080/exist/rest/db/Nelson/NellLocalPlaces.text

This issue would probably be the best place to post any results since everyone on the team has access.

Thank you so much!

Summer Transcription and Coding Work

It would be great if each of us could work on finishing these transcriptions and encoding sometime this summer. If you finish early and would like another assignment, ping me in this issue. Please use this template and reference our codebook.

Nicole: Article_1888-08-30

Rob: Article_1888-08-15

Brooke: Article_1888-08-17

Becca: Fix 1888-08-10 and 1888-08-08

Please save outside of the XML_OnSite folder ... here ... so that the schematron line provided in the template works for you.

Sorted SVG elements with xsl:sort over an Array of Variables

@RJP43 @spadafour The solution to the question of how to sort wasn't too hard, but a little different from what we're used to. The <xsl:sort> does need to sit inside an apply-templates or an xsl:for-each, but if we stop and think about it, we can set up a for-each to loop through nothing else but the value of your list of eight variables for the various speaking voices in your collection. Using <xsl:for-each> here was the secret to reducing the total lines of code so you don't have to repeat so much. Inside it, I set a variable for stroke-color, too, and then I needed to only write one definition for an SVG element <line> and one for an SVG element <text>.

Here's a direct link to the XSLT file.

Using PHP to transform XML to HTML with XSLT

PHP: Manual

The process for transformation looks like this:

  1. Create a PHP file that accepts 2 parameters from the URL
    a) XML document location
    b) XSL document location
  2. Create a DOMDocument object for each of the two files using the load function
    $xmlDocument = new DOMDocument;
    $xmlDocument->load($xmlParameter);
    $xslDocument = new DOMDocument;
    $xslDocument->load($xslParameter);
  3. Then use an XSLTProcessor object to generate the HTML output
    $processor = new XSLTProcessor;
    $processor->importStylesheet($xslDocument);
    $processor->transformToURI($xmlDocument, 'file:///var/www/html/out.html');

The above example assumes the XML, XSL, and PHP files are all in the same directory on the web server. You also need to ensure that the XSL module is installed and enabled in PHP on the web server (it is not enabled by default).

Note: You could then use AJAX to return the PHP output directly to the current web page without having to send the end-user to another web page after they click the link.

Full PHP code example below:

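<?php

// Read the XML and XSL file names from the URL query string.
// basename() strips any directory part, so only files that sit
// in the same directory as this script can be requested.
$xmlParam = basename($_GET["xml"]);
$xslParam = basename($_GET["xsl"]);

// Load the XML source document.
$xml = new DOMDocument;
$xml->load($xmlParam);

// Load the XSLT stylesheet.
$xsl = new DOMDocument;
$xsl->load($xslParam);

// Attach the stylesheet to the processor.
$proc = new XSLTProcessor;
$proc->importStylesheet($xsl);

// Write the result to a file, then send that same file back to the browser.
$outFile = '/var/www/html/out.html';
$proc->transformToURI($xml, 'file://' . $outFile);

echo file_get_contents($outFile);

?>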

Call the PHP file with a URL that looks like:
/transform.php?xml=xmlFile.xml&xsl=xslFile.xsl
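The AJAX idea mentioned in the note could be sketched roughly as follows, assuming the transform.php script and its xml and xsl URL parameters described above. The function names and the idea of passing in a target element are hypothetical choices made for illustration; treat this as a starting point rather than tested site code.

```javascript
// Build the request URL for the transform script from two file names.
function buildTransformUrl(xmlFile, xslFile) {
  const params = new URLSearchParams({ xml: xmlFile, xsl: xslFile });
  return `/transform.php?${params.toString()}`;
}

// Fetch the transformed HTML and place it inside an existing element
// (for example, the inner div where the article text appears).
// fetchFn defaults to the browser's fetch; it is a parameter only so the
// function can be exercised without a live server.
async function loadArticle(xmlFile, xslFile, target, fetchFn = fetch) {
  const response = await fetchFn(buildTransformUrl(xmlFile, xslFile));
  if (!response.ok) {
    throw new Error(`Transform request failed with status ${response.status}`);
  }
  target.innerHTML = await response.text();
}

// Example call from a page, assuming a div with id "articleText" exists:
// loadArticle("xmlFile.xml", "xslFile.xsl", document.getElementById("articleText"));
```

With this approach the rest of the page stays in place and only the contents of the target element change when a different article or stylesheet is selected.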

Schematron Issue

@RJP43 @spadafour I am currently transcribing the article Becca assigned me, and I noticed that since the GitHub was rearranged, there is an issue with the Schematron. I don't know if the file path needs changed? I just wanted to let you guys know!

Article Transcription -- Spring 2016

@nlottig94

As you requested:

If you want to get started on an article, I think a nice early one to begin with, which has somehow been skipped over, would be the article from 1888-08-01. The following should be everything you need. Comment in this issue with any, well, issues .. lol

The pdf of this article can be found on this repo. here or you can view the article by going to it on the NYU Site.

Use this template to guide you for the TEI header and the general structural tagging format, but refer to this finished article because there may be some mistakes in the template that have yet to be fixed since the drastic editing of the articles at the conclusion of last semester.

This is the information from the site index that will be helpful in editing the TEI header
<bibl xml:id="CT003"> <title level="a">"City Slave Girls" <date when="1888-08-01"/></title> <title level="s">Chicago Times</title> <author>Nell Nelson</author> <editor>Chapin, Charles</editor> <edition>Newspaper Print</edition> <extent>21-part series: Front-page placement</extent> <publisher>Chicago Times</publisher> <pubPlace>Chicago</pubPlace> <date>1888</date> <publisher>New York University Digital Library Technology Services (DLTS)</publisher> <availability> <p> <ref target="http://dlib.nyu.edu/undercover/city-slave-girls-nell-nelson-chicago-times-aka-white-slave-girls-new-york-world">www.dlib.nyu.edu</ref> </p> </availability> <note/> </bibl>
and the entire site index can be reviewed here

If you have difficulty reading the pdf (you are not alone), you can reference either Barkley Section 6 or 7 to see if you can find the missing text. I say 6 or 7 because we have yet to confirm whether the article you will be working on is covered in both of these sections or just one of them. You can find the Barkley sections here. Please confirm for us when you find out what section(s) are associated.

Important to note: if you cannot transcribe something due to the quality of the image or due to the newspaper print, follow the guidelines in our codebook for damaged and unclear text. Even if you refer to the Barkley text and get the correct text, still put a damage and unclear tag around the provided text to inform us that it is not directly transcribed from the pdf image. Also reference the codebook for irregular spellings and how we are handling those. These are the first two (of the three) rules listed.

Don't worry about the grammar tagging; however, you can throw simple <placeName> tags around references to places, <orgName> tags around company or organization references, <persName> tags around specifically named people, and <said> tags around any quoted conversations. Those tags will need some editing later, as they are very generic and we have a more complex system of attributes in place for them that I can explain better in person or later in another issue.

Thanks for wanting to be a part of this and @spadafour and I welcome you to the Nelson Team :)

Nelson Team Web Space Ready!

The Pgh Supercomputing folks have set up the web space for the Nell Nelson project.
To access your space, using WinSCP or another SFTP program of your choice, do this:

  1. go to access your own web space first.
  2. navigate up to a parent directory, and do that again. (So, go up two levels.)
  3. then, navigate down the following: var/www/html

You'll see three folders: dickinson16, ebb8, and nelson. Your project team's is the nelson folder, of course, and you can get started building in there. @RJP43, you may wish to let the team start with a clean slate in here, since the website and project files are all stashed here in GitHub. (That's up to you and your team.)

Note: The entire class can access both shared project folders--so just be careful not to write project files into the wrong directory!
@RJP43 @ghbondar @CodyKarch @rCarls @spadafour @KariWomack

Checkbox Javascript???

How do we toggle elements with a specific class to show/hide via a checkbox? I've been googling solutions, and this is the best I could come up with:

// assign function to onclick property of checkbox
document.getElementById('active').onclick = function() {
    // call toggleSub when checkbox clicked
    // toggleSub args: checkbox clicked on (this), id of element to show/hide
    toggleSub(this, 'active_sub');
};

// called onclick of checkbox
function toggleSub(box, id) {
    // get reference to related content to display/hide
    var el = document.getElementById(id);

    if ( box.checked ) {
        el.style.display = 'block';
    } else {
        el.style.display = 'none';
    }
}
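Since the question was about elements sharing a class rather than a single id, here is one possible variation on the snippet above. It is a sketch only: the function name toggleClass and the optional root parameter are made up for illustration, and the 'active' and 'active_sub' names are carried over from the snippet above.

```javascript
// Show or hide every element that carries the given class name.
// box: the checkbox that was clicked
// className: class shared by the elements to show/hide
// root: where to search; defaults to the whole page
function toggleClass(box, className, root = document) {
  const elements = root.querySelectorAll("." + className);
  for (const el of elements) {
    // An empty string clears the inline style, so the element falls back
    // to whatever display value the stylesheet gives it.
    el.style.display = box.checked ? "" : "none";
  }
}

// Example wiring on a page, assuming a checkbox with id "active":
// document.getElementById('active').onclick = function () {
//     toggleClass(this, 'active_sub');
// };
```

Clearing the inline style instead of forcing 'block' matters if the toggled elements are inline, such as spans wrapped around marked-up words, since 'block' would push each one onto its own line.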

Help - edit Keystone DH proposal

@ebeshero @ghbondar @nlottig94 @brookestewart @spadafour

Hi everyone,

I made a significant number of edits to our original Nelson conference proposal (per @ebeshero 's suggestions) and now I am about 70 words over. If any of you could find the time to read over this edited proposal and suggest where it can be cut down or what needs further editing, it would be greatly appreciated. Changes can be made to the file or commented on in this issue. Seeing as this needs done by Monday, I would like to be done editing it by tomorrow.

Thanks everyone!

12-4 Class Meeting

We reviewed:

  • Specifics of tagging articles
  • Cody - worked on CSS

For Next Class:

  • Everyone = Further Tagging of Articles: (again, refer to 10/30)
  • Cody - more work on CSS (rip out methodology and move it to About...drop down boxes in there?)
  • Rob - new system of tags

@CodyKarch @KariWomack @RJP43

Also....
For tagging complex paragraphs containing multiple instances of attributes and connotations, here is the idea that @ebeshero and I discussed after class:

  • What happens when we have both positive and negative connotations within a single block of text, and what do we do when Nell Nelson adds a break in the sentence?

"You have a nice house," she said. "I would like to live in it. It is a nice house. The house is warm, and I like warm houses. However, your couch is too squishy and I do not like squishy couches. Nevertheless, I would still like to live in this house."

  • To take care of the connotation problem, we could have the options of positive, negative, and mixed(this is the new one). Then, to take care of the break, we could wrap it in some element to designate it as such, but it remains contained within the same block of speech.

"<said who="workingGirl" ana="female"><rs type="livingConditions" subtype="MIXED">You have a nice house,<elementToDesignateInterruption>" she said. "</elementToDesignateInterruption>I would like to live in it. It is a nice house. The house is warm, and I like warm houses. However, your couch is too squishy and I do not like squishy couches. Nevertheless, I would still like to live in this house.</rs></said>"

  • We would just need to designate what the element break would be named (I believe this is the solution we discussed).
