nfdi4plants / arctrl
Library for management of Annotated Research Contexts (ARCs) using an in-memory representation and runtime-agnostic contract systems.
License: MIT License
Describe the bug && To Reproduce
I create a QAssay from a byte[] stream and convert it to a QSheet list as follows:
let ms = new System.IO.MemoryStream(byteArray)
let _,assay = ISADotNet.XLSX.AssayFile.Assay.fromStream ms
let tables = QueryModel.QAssay.fromAssay assay
tables.Sheets
|> List.map (fun (s: QSheet) -> s.Inputs) // map over QSheets
|> printfn "[INPUTS]: %A"
As you can see, the fields with a value are correctly written as Some Source, whereas the others are written as Some Sample.
Expected behavior
All fields are Some Source.
Describe the bug
QProcessSequence.ValuesOf(node,ProtocolName) does not return the correct values as stored in isa.assay.xlsx
To Reproduce
I created a test.fsx in a sample ARC which reproduces the error, and I invited HLWeil.
Expected behavior
calling:
#r "nuget: arcIO.NET, 0.0.6"
#r "nuget: ISADotNet.QueryModel, 0.7.0-preview.5"
open ISADotNet
open ISADotNet.QueryModel
let arcPath = __SOURCE_DIRECTORY__ + @"\..\"
let p,a = arcIO.NET.Assay.readByName arcPath "testassay"
let qa = QueryModel.QStudy.fromAssay a
qa.ProtocolNames
let allSamples =
    qa.LastNodes()
    |> Set.ofSeq
let getBioRep (fN:QNode) =
    match qa.ValuesOf(fN,ProtocolName = "Growth").WithName("biological replicate").Values.Head with
    | QueryModel.ISAValue.Characteristic x -> x.Value.Value.AsString
    | _ -> failwith "no biorep please add"
let t1 =
    allSamples
    |> Array.ofSeq
    |> Array.map getBioRep
should return the values stored in the isa.assay file:
Screenshots
Additional context
calling
let getGenotype (fN:QNode) =
    match qa.ValuesOf(fN,ProtocolName = "GenotypeLib").WithName("Genotype").Values.Head with
    | QueryModel.ISAValue.Characteristic x -> x.Value.Value.AsString
    | _ -> failwith "no biorep please add"
/// Seems to return correct values
let t2 =
    allSamples
    |> Array.ofSeq
    |> Array.map getGenotype
returns correct results
Assay files are modified using the swate tool. To access this information for other tasks, a reader should be added.
As the swate tool adds tables to assay xlsx files, a prerequisite for this reader is a table reader in FSharpSpreadsheetML. The information could be stored as grouped processes.
At the moment ISADotNet uses a function that heavily abuses reflection to append lists/seqs and arrays as obj.
let inline appendGenericListsByType l1 l2 (t:Type) =
    System.Reflection.Assembly
        .GetAssembly(typeof<_ list>)
        .GetType(if isArray then "Microsoft.FSharp.Collections.ArrayModule" else "Microsoft.FSharp.Collections.ListModule")
        .GetMethod("Append")
        .MakeGenericMethod(t)
        .Invoke(null, [|l1;l2|])
None of the functions used here are Fable compatible, so how can we translate it?
I tried using #if FABLE_COMPILER to provide an alternative solution in which we just assume the type to be a list.
List.append
#if FABLE_COMPILER
!!List.append l1 l2 // `!!` means the compiler should ignore any typechecks here
#else
...
This solution works for lists but not for arrays or seqs. Arrays are not appended correctly; seqs are appended correctly but no longer match the seq type.
So I tried another solution, in which I wanted to match the list against any Array type. This works in .NET in an .fsx script, but it does not even compile with Fable, as Fable cannot do such checks at runtime because JS does not support them.
match l1 with
| :? System.Array as arr ->
    !!Array.append l1 l2
| _ ->
    !!List.append l1 l2
The same thing goes for an if...else with l1.GetType().IsArray.
warning FABLE: Types can only be resolved at compile time. At runtime this will be same as `typeof
-> Therefore, in Fable, l1.GetType().IsArray in an inline function taking obj will always resolve as obj and will never return true.
Emit can tell fable to change the output of a function fully to any js code written inside the Emit attribute.
Fable uses special classes to represent F# IEnum types, each with different append methods, so there is no one function to rule them all, and choosing one would again require type checking at runtime (which does not work). Unless there is a way, unknown to me, of converting all of these Fable classes to a generic JS array, this will also not work.
How do f# IEnum types look in js? repl
let l = [1 .. 20]
let a = [|1 .. 20|]
let s = seq [1 .. 20]
import { toArray, toList } from "fable-library/Seq.js";
import { rangeDouble } from "fable-library/Range.js";
export const l = toList(rangeDouble(1, 1, 20));
export const a = toArray(rangeDouble(1, 1, 20));
export const s = toList(rangeDouble(1, 1, 20));
It might be necessary to remove the reflection in this file and use type-safe functions based on generics instead. This would result in a rather large redesign.
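As a rough illustration of what "type-safe functions based on generics" could look like, here is a minimal sketch. The type name Append and the member name Collections are hypothetical and not part of ISADotNet. The overload is picked at compile time from the static argument types, so no reflection or runtime type test is involved, which is the part Fable cannot express. Whether this fits the existing call sites, which currently work on obj, is an open question.

```fsharp
/// Hypothetical helper (not part of ISADotNet): the overload is selected at
/// compile time from the static types of the arguments.
type Append =
    /// Append two F# lists of the same element type.
    static member Collections (l1: 'T list, l2: 'T list) : 'T list = List.append l1 l2
    /// Append two arrays of the same element type.
    static member Collections (a1: 'T [], a2: 'T []) : 'T [] = Array.append a1 a2

let mergedList = Append.Collections ([1; 2], [3])
let mergedArray = Append.Collections ([|1; 2|], [|3|])
```

The trade-off is that callers must know the collection type statically, which is exactly what the reflection-based version tried to avoid.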
I am open to suggestions on how to solve this issue @HLWeil @muehlhaus @kMutagene
Describe the bug
This is a follow-up issue to #51. I added an input column Source Name, put the previous information from Data File Name in there, and added artificial names to the now empty output column. These columns now look like this:
Data File Name | Source Name |
---|---|
result1 | DB_097_CAMMD_CAGATC_L001_R1_001.fastq.gz |
result2 | DB_099_CAMMD_CTTGTA_L001_R1_001.fastq.gz |
result3 | DB_103_CAMMD_AGTCAA_L001_R1_001.fastq.gz |
result4 | DB_161_reC3MD_GTCCGC_L001_R1_001.fastq.gz |
result5 | DB_163_reC3MD_GTGAAA_L001_R1_001.fastq.gz |
result6 | DB_165_re-C3MD_GTGAAA_L002_R1_001.fastq.gz |
The DAG will now be displayed, but I found two odd occurrences:
As you can see, the DAG shows that sheets 3 and 4 are applied twice. I cannot find a reason why this should be intended; maybe you can help me out here.
Describe the bug
I use the code block below in Swate to display Swate tables as Viz in embedded HTML. For the attached assay.xlsx file I get the non-descriptive error "Object reference not set to an instance of an object." I found that the last sheet, 4COM01_RNASeq, is missing an Input Column, and when I add it the error is gone.
let factors, protocol, assay = JsonExport.parseBuildingBlockSeqsToAssay worksheetBuildingBlocks
let processSequence = Option.defaultValue [] assay.ProcessSequence
/// This function throws the error, all above works
let dag = Viz.DAG.fromProcessSequence (processSequence,Viz.Schema.NFDIBlue)
let dagHtml = dag |> CyjsAdaption.MyHTML.toEmbeddedHTML
To Reproduce
See Bug description
Expected behavior
Make the error message more descriptive for the user
Implement IISAPrintable for more types, to improve readability of bigger isa objects.
While testing the new library structure with the fslab-docs template, it returned an error.
API docs:
generating model for 2 assemblies in API docs...
loading 2 assemblies...
registering entities for assembly ISADotNet...
registering entities for assembly ISADotNet.XLSX...
Error :
FSharp.Compiler.ErrorLogger+UnresolvedPathReferenceNoRange: Assembly: DocumentFormat.OpenXml, full path: DocumentFormat.OpenXml.Spreadsheet.Row
at FSharp.Compiler.TypedTree.CcuThunk.EnsureDerefable(String[] requiringPath) in F:\workspace\_work\1\s\src\fsharp\TypedTree.fs:line 5103
at FSharp.Compiler.TypedTree.NonLocalEntityRef.TryDeref(Boolean canError) in F:\workspace\_work\1\s\src\fsharp\TypedTree.fs:line 3157
at FSharp.Compiler.TypedTree.EntityRef.get_Deref() in F:\workspace\_work\1\s\src\fsharp\TypedTree.fs:line 3254
at FSharp.Compiler.TypedTreeOps.stripTyEqnsA(TcGlobals g, Boolean canShortcut, TType ty) in F:\workspace\_work\1\s\src\fsharp\TypedTreeOps.fs:line 739
at FSharp.Compiler.TypedTreeOps.tyargsEnc(TcGlobals g, FSharpList`1 gtpsType, FSharpList`1 gtpsMethod, FSharpList`1 args) in F:\workspace\_work\1\s\src\fsharp\TypedTreeOps.fs:line 8035
at FSharp.Compiler.TypedTreeOps.typeEnc(TcGlobals g, FSharpList`1 gtpsType, FSharpList`1 gtpsMethod, TType ty) in F:\workspace\_work\1\s\src\fsharp\TypedTreeOps.fs:line 8009
at Microsoft.FSharp.Primitives.Basics.List.map[T,TResult](FSharpFunc`2 mapping, FSharpList`1 x) in F:\workspace\_work\1\s\src\fsharp\FSharp.Core\local.fs:line 247
at FSharp.Compiler.TypedTreeOps.XmlDocArgsEnc(TcGlobals g, FSharpList`1 gtpsType, FSharpList`1 gtpsMethod, FSharpList`1 argTys) in F:\workspace\_work\1\s\src\fsharp\TypedTreeOps.fs:line 8040
at FSharp.Compiler.TypedTreeOps.XmlDocSigOfVal(TcGlobals g, Boolean full, String path, Val v) in F:\workspace\_work\1\s\src\fsharp\TypedTreeOps.fs:line 8090
at FSharp.Compiler.SourceCodeServices.SymbolHelpers.GetXmlDocSigOfScopedValRef(TcGlobals g, EntityRef tcref, ValRef vref) in F:\workspace\_work\1\s\src\fsharp\symbols\SymbolHelpers.fs:line 541
at FSharp.Compiler.SourceCodeServices.FSharpMemberOrFunctionOrValue.get_XmlDocSig() in F:\workspace\_work\1\s\src\fsharp\symbols\Symbols.fs:line 1845
at FSharp.Formatting.ApiDocs.CrossReferences.getXmlDocSigForMember(FSharpMemberOrFunctionOrValue memb) in C:\Users\Kevin\source\repos\fsprojects\FSharp.Formatting\src\FSharp.Formatting.ApiDocs\GenerateModel.fs:line 575
at FSharp.Formatting.ApiDocs.CrossReferenceResolver.registerMember(FSharpMemberOrFunctionOrValue memb) in C:\Users\Kevin\source\repos\fsprojects\FSharp.Formatting\src\FSharp.Formatting.ApiDocs\GenerateModel.fs:line 647
at FSharp.Formatting.ApiDocs.CrossReferenceResolver.registerEntity(FSharpEntity entity) in C:\Users\Kevin\source\repos\fsprojects\FSharp.Formatting\src\FSharp.Formatting.ApiDocs\GenerateModel.fs:line 669
at <StartupCode$FSharp-Formatting-ApiDocs>[email protected](FSharpEntity arg00) in C:\Users\Kevin\source\repos\fsprojects\FSharp.Formatting\src\FSharp.Formatting.ApiDocs\GenerateModel.fs:line 2149
at Microsoft.FSharp.Collections.SeqModule.Iterate[T](FSharpFunc`2 action, IEnumerable`1 source) in F:\workspace\_work\1\s\src\fsharp\FSharp.Core\seq.fs:line 497
at FSharp.Formatting.ApiDocs.ApiDocModel.Generate(FSharpList`1 projects, String collectionName, FSharpOption`1 libDirs, FSharpOption`1 otherFlags, Boolean qualify, FSharpOption`1 urlRangeHighlight, String root, FSharpList`1 substitutions, Boolean strict) in C:\Users\Kevin\source\repos\fsprojects\FSharp.Formatting\src\FSharp.Formatting.ApiDocs\GenerateModel.fs:line 2149
at FSharp.Formatting.ApiDocs.ApiDocs.GenerateHtmlPhased[a](FSharpList`1 inputs, String output, String collectionName, FSharpList`1 substitutions, FSharpOption`1 template, FSharpOption`1 root, FSharpOption`1 qualify, FSharpOption`1 libDirs, FSharpOption`1 otherFlags, FSharpOption`1 urlRangeHighlight, FSharpOption`1 strict) in C:\Users\Kevin\source\repos\fsprojects\FSharp.Formatting\src\FSharp.Formatting.ApiDocs\ApiDocs.fs:line 54
at <StartupCode$fsdocs>[email protected](Unit unitVar0) in C:\Users\Kevin\source\repos\fsprojects\FSharp.Formatting\src\FSharp.Formatting.CommandTool\BuildCommand.fs:line 601
at <StartupCode$fsdocs>.$BuildCommand.protect@338(CoreBuildOptions this, FSharpFunc`2 f) in C:\Users\Kevin\source\repos\fsprojects\FSharp.Formatting\src\FSharp.Formatting.CommandTool\BuildCommand.fs:line 340
Maybe @kMutagene can help with this.
Describe the bug
If an Xlsx file that has been opened and saved in MS Excel before is edited via Worksheet.setSheetData and saved, the file gets corrupted and the SheetData of this file's worksheet cannot be obtained anymore.
To Reproduce
#r "nuget: FSharpSpreadsheetML"
open FSharpSpreadsheetML
let path =
    System.Environment.GetFolderPath(System.Environment.SpecialFolder.UserProfile)
    |> fun fp -> System.IO.Path.Combine(fp, "mySpreadsheet.xlsx")
let doc = Spreadsheet.init "mySheet" path
let sd = Spreadsheet.tryGetSheetBySheetName "mySheet" doc |> Option.get
SheetData.appendValueToRowAt None 1u "Hello, World!" sd
Spreadsheet.close doc
let doc = Spreadsheet.fromFile path true
let sd = Spreadsheet.tryGetSheetBySheetName "mySheet" doc |> Option.get
let wsp = Spreadsheet.tryGetWorksheetPartBySheetName "mySheet" doc |> Option.get
let ws = Worksheet.get wsp
Worksheet.setSheetData sd ws
Spreadsheet.close doc
Expected behavior
Uncorrupted file.
Describe the bug
Person.removeFullName returns empty Person lists under most conditions (detailed explanation under Possible solution(s)).
To Reproduce
Steps to reproduce the behavior:
arc init
arc i create
arc i person register -> Fill LastName: Doe and FirstName: John
(repeat with a second person, exemplarily LastName: Patternman and FirstName: Max)
arc i person list
arc i person unregister -> Fill LastName and FirstName with one of the persons from before
arc i person list
Only one of the persons is gone.
Possible solution(s)
In \API\person.fs, the removeFullName function is defined as follows:
let removeByFullName (firstName : string) (midInitials : string) (lastName : string) (persons : Person list) =
    List.filter (fun p ->
        if midInitials = "" then
            p.FirstName = Some firstName && p.LastName = Some lastName
            |> not
        else
            p.FirstName = Some firstName && p.MidInitials = Some midInitials && p.LastName = Some lastName
            |> not
    ) persons
The compiler interprets the part
p.FirstName = Some firstName && p.LastName = Some lastName
|> not
as
(p.FirstName = Some firstName) && (p.LastName = Some lastName
|> not)
Either change to
(p.FirstName = Some firstName && p.LastName = Some lastName)
|> not
or, equivalently (the negation of a conjunction is the disjunction of the negations), to
p.FirstName <> Some firstName || p.LastName <> Some lastName
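The precedence problem described above can be reproduced in isolation. The following is a minimal sketch with a simplified, hypothetical Person record (only two fields, not the ISADotNet type) that contrasts the buggy filter with the parenthesized version.

```fsharp
/// Simplified stand-in for the ISADotNet Person type (hypothetical).
type Person = { FirstName: string option; LastName: string option }

let persons =
    [ { FirstName = Some "John"; LastName = Some "Doe" }
      { FirstName = Some "Max"; LastName = Some "Patternman" } ]

/// Buggy: `|> not` binds tighter than `&&`, so only the last comparison is negated.
let removeBuggy firstName lastName (persons: Person list) =
    persons
    |> List.filter (fun p ->
        p.FirstName = Some firstName && p.LastName = Some lastName |> not)

/// Fixed: the whole conjunction is negated.
let removeFixed firstName lastName (persons: Person list) =
    persons
    |> List.filter (fun p ->
        not (p.FirstName = Some firstName && p.LastName = Some lastName))

let buggyResult = removeBuggy "John" "Doe" persons
let fixedResult = removeFixed "John" "Doe" persons
```

With these inputs, buggyResult is empty while fixedResult still contains Max Patternman.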
Describe the bug
If an Assay consists of headers (regardless of whether Parameter, Factor, or Characteristic) that contain unusual characters (namely ( or [ ), the column cannot be parsed.
To Reproduce
Steps to reproduce the behavior:
hello (world)
and hello [world]
Expected behavior
Either
Currently, ISATab style files can only be read and written as binary XLSX files. This should be extended to also capture plain text Tab files, as originally intended by the format specification.
This should be pretty straightforward for investigation files. For assay and study files though, some planning is necessary, as plain text files don't support multiple sheets like XLSX files.
Change Create static methods to optional parameters
Adjust naming: GetNameAsString should be changed to GetName, as name is already a string
For consistency, change methods to static methods
Describe the bug
The function ISADotNet.API.ProcessSequence.getOutputsWithCharacteristicBy does not return the Characteristics and Outputs of the given protocol. Instead, the Characteristics of the Protocol where the outputs serve as Input are returned.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The function should return the characteristics of the protocol that match the predicate.
Successful building of the library is an absolute minimum criterion for all changes pushed to this repository. To ensure this, CI should be added.
Building and testing similar to the one performed in BioFSharp
Would be nice to have an optional parameter/function which reads out Swate tables in an assay file, while also storing the information about their order somewhere in ISA-JSON. Preferably as comment.
The following code
let s = """
{
"characteristics": [
{},
{}
]
}
"""
JsonSerializer.Deserialize<ProcessOutput>(s,JsonExtensions.options)
fails with System.Collections.Generic.KeyNotFoundException: An index satisfying the predicate was not found in the collection.
Directly trying to deserialize a sample works fine:
JsonSerializer.Deserialize<Sample>(s,JsonExtensions.options)
ProcessOutput is an AnyOf union type where Sample is a case. It is therefore safe to assume that the problem is specific to the AnyOf deserializer.
Removing one of the two items inside the characteristics property also alleviates the problem. So it seems this problem is also specific to item lists inside an AnyOf item.
It took me very long to find that you cannot write a single Assay via something like Assay.write. Instead, even when only writing a single Assay file, this seems to be necessary:
open ISADotNet.XLSX
[Assay.empty]
|> Assays.write ...
Also, this does not seem to be the function that just writes an assay.xlsx file, does it? Does something like that exist?
So what I am looking for is something like Investigation.toFile for a single Assay. Can this be done currently?
I would also suggest dropping the s from Assays to provide a more unified API across ISADotNet and ISADotNet.XLSX.
An ISAXLSX assay file reader was implemented in the latest release. This reader creates an object of type Assay. This is useful for interop with ISAJson or the investigation file, but can be cumbersome to scan through computationally in other cases. The problem here is that the information, which is depicted on a row-wise basis in the assay file, gets dispersed into different places of the data model.
Instead, an additional, obvious approach would be to read the assay file table as rows.
Describe the bug
A Study file that looks like this:
Source Name | Characteristics[c1] | Characteristics[c2] | Sample Name |
---|---|---|---|
src_1 | yes | | smpl_1 |
src_2 | yes | yes | smpl_2 |
cannot be written via the current API.
To Reproduce
#r "nuget: ISADotNet.XLSX"
open ISADotNet
open ISADotNet.XLSX
let createStudyProcess sourceName sampleName procName protName (characteristics: (string*string) list) =
    let c =
        characteristics
        |> List.mapi (fun i (k,v) ->
            MaterialAttributeValue.create(
                Category=MaterialAttribute.fromStringWithValueIndex k "" "" i,
                Value= Value.Name v
            )
        )
    let src =
        Source.create(Name = sourceName, Characteristics = c) |> ProcessInput.Source
    let smpl =
        Sample.create(Name = sampleName) |> ProcessOutput.Sample
    let proc =
        Process.create(Name=procName, ExecutesProtocol=Protocol.create(Name = protName), Inputs=[src], Outputs = [smpl])
    proc
let s =
    Study.create(
        ProcessSequence = [
            createStudyProcess "src_1" "smpl_1" "p1" "1" ["c1","yes"]
            createStudyProcess "src_2" "smpl_2" "p2" "1" ["c1","yes"; "c2","yes"]
        ])
s
|> StudyFile.Study.toFile "test.xlsx"
Expected behavior
Writes study to file
Actual behavior
System.Exception: Could not write Study to Xlsx file in path "test.xlsx":
The lists had different lengths.
list.[0] is 3 elements shorter than list.[2] (Parameter 'list.[0]')
at Microsoft.FSharp.Core.PrintfModule.PrintFormatToStringThenFail@1439.Invoke(String message)
at ISADotNet.XLSX.StudyFile.Study.toFile(String p, Study study)
at <StartupCode$FSI_0013>.$FSI_0013.main@() in C:\Users\schne\Desktop\test\Untitled-1:line 155
Stopped due to error
The API removeBy functions (e.g. Assay.removeByFileName) do not remove the target item but all other items in the list. This is because a logical not is missing.
I use the following code to ensure I have exactly one QSheet. Now my question is: can Protocols in this case have any number of items other than 1? And what real-life cases would be the reason for this?
let assay = Assay.fromString jsonString
let qAssay = QueryModel.QAssay.fromAssay assay
if qAssay.Sheets.Length <> 1 then
    failwith "Swate was unable to identify the information from the requested template (<Found more than one process in template>). Please open an issue for the developers."
let template = qAssay.Sheets.Head
template //QAssay
template.Protocols // Protocol list
Describe the bug
Characteristics seem to just be written in the order they are passed to the process, ignoring their actual names
To Reproduce
#r "nuget: ISADotNet.XLSX"
open ISADotNet
open ISADotNet.XLSX
let createStudyProcess sourceName sampleName procName protName (characteristics: (string*string) list) =
    let c =
        characteristics
        |> List.mapi (fun i (k,v) ->
            MaterialAttributeValue.create(
                Category=MaterialAttribute.fromStringWithValueIndex k "" "" i,
                Value= Value.Name v
            )
        )
    let src =
        Source.create(Name = sourceName, Characteristics = c) |> ProcessInput.Source
    let smpl =
        Sample.create(Name = sampleName) |> ProcessOutput.Sample
    let proc =
        Process.create(Name=procName, ExecutesProtocol=Protocol.create(Name = protName), Inputs=[src], Outputs = [smpl])
    proc
let s =
    Study.create(
        ProcessSequence = [
            createStudyProcess "src_1" "smpl_1" "p1" "1" ["c1","yes1"; "c2","yes2"]
            createStudyProcess "src_2" "smpl_2" "p2" "1" ["c2","yes2"; "c1","yes1"]
        ])
s
|> StudyFile.Study.toFile "test.xlsx"
Expected behavior
Writes correct study file:
Source Name | Characteristics[c1] | Characteristics[c2] | Sample Name |
---|---|---|---|
src_1 | yes1 | yes2 | smpl_1 |
src_2 | yes1 | yes2 | smpl_2 |
Actual behavior
the order of the second sample is wrong:
To update the current API state we need to bring some of the type restrictions up to date with the DataModel changes done in previous commits. In addition we should redistribute the API modules over the .fs files in the API folder.
Is your feature request related to a problem? Please describe.
ISADotNet.API.Assay.existsByFileName will return false if the given value is "assayFileName\isa.assay.xlsx" but the assay object contains the value "assayFileName/isa.assay.xlsx".
Describe the solution you'd like
The function should be able to recognize that the two paths are equal, even though the two strings are different.
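One possible way to get this behavior is to normalize the directory separators before comparing. The sketch below is only an illustration; normalizePath and pathsEqual are hypothetical names and not part of the ISADotNet API, and it deliberately ignores other path subtleties such as case sensitivity or relative segments.

```fsharp
/// Hypothetical helper: replace backslashes with forward slashes and
/// drop a trailing separator so that both spellings compare equal.
let normalizePath (path: string) =
    path.Replace('\\', '/').TrimEnd('/')

/// Compare two paths after normalizing their separators.
let pathsEqual (p1: string) (p2: string) =
    normalizePath p1 = normalizePath p2

let sameFile = pathsEqual @"assayFileName\isa.assay.xlsx" "assayFileName/isa.assay.xlsx"
```

An existsByFileName-style function could apply the same normalization to both the query value and the stored file name.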
Assay File Reader fails when no CustomXml was added with the Swate tool
System.ArgumentException: The input sequence was empty.
Parameter name: source
at Microsoft.FSharp.Collections.SeqModule.Head[T](IEnumerable`1 source) in F:\workspace\_work\1\s\src\fsharp\FSharp.Core\seq.fs:line 1364
at ISADotNet.XLSX.AssayFile.SwateTable.readSwateTables(WorkbookPart wbp)
at ISADotNet.XLSX.AssayFile.AssayFile.fromFile(String path)
at <StartupCode$FSI_0005>.$FSI_0005.main@()
Stopped due to error
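The stack trace points at Seq.head being called on an empty sequence. A defensive variant could use Seq.tryHead and fall back to an empty result. The following is a minimal sketch under the assumption that the custom XML parts can be modeled as a sequence of strings; tryReadSwateTables is a hypothetical name, not the actual readSwateTables implementation.

```fsharp
/// Hypothetical stand-in for reading Swate tables: returns an empty list
/// instead of throwing when the workbook contains no custom XML part.
let tryReadSwateTables (customXmlParts: seq<string>) : string list =
    match Seq.tryHead customXmlParts with
    | Some xml -> [ xml ] // a real reader would parse the XML here
    | None -> []          // no CustomXml was added with Swate

let withoutCustomXml = tryReadSwateTables Seq.empty
let withCustomXml = tryReadSwateTables [ "<SwateTable />" ]
```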
Describe the bug
I am working on a function to write Swate tables from an external data type. The function looks like this:
/// tables is the external datatype parsed to `Assay`
let assay = Export.parseBuildingBlockSeqsToAssay tables
let a = QueryModel.QAssay.fromAssay assay
let wb =
    workbook {
        for (i,s) in List.indexed a.Sheets do QSheet.toSheet i s
        sheet "Assay" {
            for r in MetaData.toDSLSheet assay [] do r
        }
    }
/// Parsing unit is not done correctly.
let fsSpreadsheet = wb.Value.Parse().ToBytes()
Most code is taken from here https://github.com/nfdi4plants/ISADotNet/blob/a06af930e4d3f9d3c49a7b07bb0496f927c4e6cc/src/ISADotNet.XLSX/AssayFile/Assay.fs#L188
If I try to write Swate unit columns, the final .xlsx file does not contain any numberFormat information. All cells have DataType string, even though it should be something like "0.00 \"unit\"".
Image shows Cell.DataType and Cell.Value
If I convert my assay to JSON with the following code, all unit information is still there, so I assume the information is lost somewhere on the way from ISADotNet to SpreadsheetFs
/// tables is the external datatype parsed to `Assay`
let assay = Export.parseBuildingBlockSeqsToAssay tables
let parsedJsonStr = ISADotNet.Json.Assay.toString assay
// no unit information lost.
Describe the bug
A Study file that looks like this:
Source Name | Characteristics[c1] | Characteristics[c2] | Sample Name |
---|---|---|---|
src_1 | yes | | smpl_1 |
src_2 | | yes | smpl_2 |
cannot be written correctly via the current API.
To Reproduce
#r "nuget: ISADotNet.XLSX"
open ISADotNet
open ISADotNet.XLSX
let createStudyProcess sourceName sampleName procName protName (characteristics: (string*string) list) =
    let c =
        characteristics
        |> List.mapi (fun i (k,v) ->
            MaterialAttributeValue.create(
                Category=MaterialAttribute.fromStringWithValueIndex k "" "" i,
                Value= Value.Name v
            )
        )
    let src =
        Source.create(Name = sourceName, Characteristics = c) |> ProcessInput.Source
    let smpl =
        Sample.create(Name = sampleName) |> ProcessOutput.Sample
    let proc =
        Process.create(Name=procName, ExecutesProtocol=Protocol.create(Name = protName), Inputs=[src], Outputs = [smpl])
    proc
let s =
    Study.create(
        ProcessSequence = [
            createStudyProcess "src_1" "smpl_1" "p1" "1" ["c1","yes"]
            createStudyProcess "src_2" "smpl_2" "p2" "1" ["c2","yes"]
        ])
s
|> StudyFile.Study.toFile "test.xlsx"
Expected behavior
Writes correct study file
Actual behavior
Both samples get annotated via Characteristics [c1], while Characteristics [c2] is omitted:
Describe the bug
StudyFile.Study.fromFile returns some empty lists in the resulting study in the record fields
.Protocols
.ProcessSequence
.Factors
.CharacteristicCategories
To Reproduce
Steps to reproduce the behavior:
StudyFile.Study.fromFile
Expected behavior
Nones instead of Some []s.
Screenshots
The difference between the same study, once from the Study file itself (top) and once from the Investigation file (bottom):
Describe the bug
When using AssayFile.Assay.fromFile and (Investigation.fromFile <path>).Studies.Value.Assays.Value.[i].Value, the resulting assays are not identical even if they were initialized together in one ARC (e.g. via ArcCommander).
To Reproduce
Steps to reproduce the behavior:
arc a edit -a <assayID>
Expected behavior
Assay objects are identical.
Currently, when an assay.xlsx file is read, the MaterialAttributeValues (or Characteristics) which are located as columns between inputs and outputs are assigned to both. To reduce ambiguity when accessing a given characteristic or when writing to an assay.xlsx file again, the characteristic could instead be appended only to the input.
Describe the bug
When creating an Assay and set several rows with the same Source Name, it should not be possible to set different Values for a Characteristic, but it is atm.
To Reproduce
Steps to reproduce the behavior:
MyCharacteristic 1
MySource in the first value row for column Source Name
value 1 in the first value row for column My Characteristic 1
value 2 in the second value row for column My Characteristic 1
(see screenshot below for steps 2 – 6)
Expected behavior
Forbid this (via, e.g., throwing an error or sth. alike).
A README with additional information is needed. These could include
gh-pages documentation if growing
Is your feature request related to a problem? Please describe.
https://github.com/nfdi4plants/ISADotNet/blob/a06af930e4d3f9d3c49a7b07bb0496f927c4e6cc/src/ISADotNet.XLSX/AssayFile/Assay.fs#L133
would definitely be more useful if it contained the name of the Assay. In ARCs with dozens of Assays, it is annoying to check every Assay file one by one to find where it crashes.
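One way to get the assay name into the message is to wrap the read call and re-raise with added context. This is only a sketch; readAssayWithContext is a hypothetical helper and the reader function is passed in as a plain closure rather than the actual ISADotNet.XLSX reader.

```fsharp
/// Hypothetical wrapper: runs a reader and, on failure, re-raises with the
/// assay name prepended so the failing file can be identified directly.
let readAssayWithContext (assayName: string) (read: unit -> 'T) : 'T =
    try
        read ()
    with err ->
        failwithf "Could not read assay \"%s\": %s" assayName err.Message

// Usage with a reader that always fails, to show the resulting message.
let message =
    try
        readAssayWithContext "testassay" (fun () -> (failwith "The input sequence was empty." : int))
        |> ignore
        ""
    with err -> err.Message
```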
Is your feature request related to a problem? Please describe.
Assay and Study xlsx files will soon contain more columns describing the underlying protocol. These will be:
Protocol Type (+ Term Source REF and Term Accession Number)
Protocol REF
Protocol Description
Protocol URI
Protocol Version
Describe the solution you'd like
Assay (and Study) xlsx file writers should be able to handle these new columns.
References
nfdi4plants/Swate#207
nfdi4plants/nfdi4plants_ontology#32
Describe the bug
The given string "string \"inside\" string" can be correctly handled by the JSON deserializer if it is deserialized as a string, but fails if it is deserialized as part of an AnyOf object.
To Reproduce
works:
"\"string \\\"inside\\\" string\""
|> ISADotNet.JsonExtensions.fromString<string>
fails:
"\"string \\\"inside\\\" string\""
|> ISADotNet.Json.Value.fromString
Expected behavior
Should result in
val it : ISADotNet.Value = Name "string "inside" string"
Additional context
Error message
System.Collections.Generic.KeyNotFoundException: An index satisfying the predicate was not found in the collection.
at Microsoft.FSharp.Collections.ArrayModule.loop@448-37[T,TResult](FSharpFunc`2 chooser, T[] array, Int32 i) in F:\workspace\_work\1\s\src\fsharp\FSharp.Core\array.fs:line 450
at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
at System.Text.Json.Serialization.JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
at System.Text.Json.JsonSerializer.ReadCore[TValue](Utf8JsonReader& reader, Type returnType, JsonSerializerOptions options)
at System.Text.Json.JsonSerializer.Deserialize[TValue](String json, Type returnType, JsonSerializerOptions options)
at <StartupCode$FSI_0073>.$FSI_0073.main@()
Stopped due to error
I translated one of the existing Swate templates to the new preview format and used the new common api row major format with it.
In Assay the worksheet name was correct, but after parsing to RowWiseSheet it was wrong, as 1SPL01_plants was changed to 1SPL01.
The problem lies in https://github.com/nfdi4plants/ISADotNet/blob/AssayFileIO/src/ISADotnet/JsonIO/AssayCommonAPI.fs#L98 .
I am currently working on two different versions of a fix.
The new library structure contains the initial setup for automatic docs generation.
With private access, a how-to can be found here: https://github.com/CSBiology/KnowledgeBase/tree/main/knowledgebase/devops/library-development
Currently, when parsing a Swate AnnotationTable with ISADotNet to any ISA-JSON format, ontology terms are parsed as shown below. The term accession should be in URI format and can be created by combining the existing TSR and TAN with a static OBO PURL URL:
"category": {
"characteristicType": {
"annotationValue": "Sample type",
"termSource": "NFDI4PSO",
"termAccession": "0000064"
}
}
Purl URL:
"http://purl.obolibrary.org/obo/"
Currently working on a PR to add this.
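A minimal sketch of the combination described above. It assumes the usual OBO PURL convention of joining the term source REF and the accession number with an underscore; toTermAccessionUri is a hypothetical name and not necessarily what the PR implements.

```fsharp
/// Static OBO PURL base URL, as given above.
let oboPurl = "http://purl.obolibrary.org/obo/"

/// Hypothetical helper: combine TSR and TAN into a URI, assuming the
/// "<prefix>_<local id>" layout commonly used for OBO PURLs.
let toTermAccessionUri (termSourceRef: string) (termAccessionNumber: string) =
    sprintf "%s%s_%s" oboPurl termSourceRef termAccessionNumber

let uri = toTermAccessionUri "NFDI4PSO" "0000064"
// uri = "http://purl.obolibrary.org/obo/NFDI4PSO_0000064"
```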
Hi there, I am currently testing this library in a project. There are a few things I have noticed (more issues incoming). The first one is very important, in my opinion.
How is this library intended to be used in a script? My naive approach without docs would be to build the object hierarchy from the ground up and then write the whole thing to disk.
So like:
build my assay(s) -> add them to a study -> add that to an investigation -> save the whole thing to disk.
The very first step is very tedious, partly because of the forced usage of option (but this is already addressed in #24), partly because the parameter order is not pipeline-friendly.
So, for example, if I want to set the samples of an empty assay, this has to be done:
let assay = Assay.setMaterials Assay.empty (AssayMaterials.create (Some Samples) None)
Note that the assay is the first parameter of Assay.setMaterials. I would suggest switching the parameters everywhere, so that the object that gets changed is the last one.
I would like to be able to do something like this:
let assay =
    Assay.empty
    |> Assay.setMaterials(
        Materials.create(
            ?OptField1 = ... //ignore second field because i dont know its value by using optional parameters here
        )
    )
    |> Assay.setData(
        Data.create (...)
    )
    ...
Or am I just not using this as intended? If so, how would it be intended?
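To make the suggested parameter order concrete, here is a minimal sketch with a simplified, hypothetical Assay record (two fields only, not the ISADotNet type). Each setter takes the value first and the object being changed last, so calls compose with the pipe operator.

```fsharp
/// Simplified stand-in for the ISADotNet Assay type (hypothetical fields).
type Assay = { FileName: string option; Samples: string list option }

module Assay =
    let empty = { FileName = None; Samples = None }
    /// The assay being changed is the last parameter, so it can be piped in.
    let setFileName (fileName: string) (assay: Assay) = { assay with FileName = Some fileName }
    let setSamples (samples: string list) (assay: Assay) = { assay with Samples = Some samples }

let assay =
    Assay.empty
    |> Assay.setFileName "isa.assay.xlsx"
    |> Assay.setSamples [ "smpl_1"; "smpl_2" ]
```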
The toOptions function here is really nice to have. It should exist not only as a static member but also as member this.toOptions, to improve discoverability.
When updating one record type with another using the methods in API.Update, even when using the UpdateByExisting option, a field with a filled record is replaced by a record where no value is set.
An example for this is the Measurementtype field of the Assay type.
There should be a way to detect empty records and ignore them, possibly using an option for records in records.
When trying to parse aggregated strings as used in the ISATab/ISAXLSX format, two problems occur:
Even when given empty strings, one empty element is returned, although the result should actually be an empty list:
OntologyAnnotation.fromAggregatedStrings ';' "" "" ""
will result in
[{ ID = null ;Name = Text "" ;TermSourceREF = "" ;TermAccessionNumber = "" ;Comments = [] }]
When given multiple names but no other values, records with only names should be created. Instead, it crashes because of the different input lengths:
OntologyAnnotation.fromAggregatedStrings ';' "OA1;OA2" "" ""
results in
System.ArgumentException: The arrays have different lengths. array1.Length = 2, array2.Length = 1, array3.Length = 1
Parameter name: array1, array2, array3
at Microsoft.FSharp.Core.DetailedExceptions.invalidArg3ArraysDifferent[?](String arg1, String arg2, String arg3, Int32 len1, Int32 len2, Int32 len3) in F:\workspace\_work\1\s\src\fsharp\FSharp.Core\local.fs:line 69
at Microsoft.FSharp.Collections.ArrayModule.Map3[T1,T2,T3,TResult](FSharpFunc`2 mapping, T1[] array1, T2[] array2, T3[] array3) in F:\workspace\_work\1\s\src\fsharp\FSharp.Core\array.fs:line 309
at ISADotNet.XLSX.OntologyAnnotation.fromAggregatedStrings(Char separator, String terms, String accessions, String source)
at <StartupCode$FSI_0041>.$FSI_0041.main@()
Stopped due to error
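Both problems could be addressed by treating an empty string as zero elements and padding the shorter columns instead of passing them to Array.map3. The following is a sketch of that idea only, using a simplified, hypothetical record (no ID or Comments fields) and the parameter order shown in the stack trace; it is not the ISADotNet.XLSX implementation.

```fsharp
/// Simplified stand-in for the ISADotNet OntologyAnnotation type (hypothetical).
type OntologyAnnotation =
    { Name: string; TermAccessionNumber: string; TermSourceREF: string }

/// Split an aggregated string; an empty input yields no elements at all.
let splitAggregated (separator: char) (s: string) : string [] =
    if s = "" then [||] else s.Split([| separator |])

/// Sketch of a more forgiving fromAggregatedStrings: shorter columns are
/// padded with empty strings instead of failing on different lengths.
let fromAggregatedStrings (separator: char) (terms: string) (accessions: string) (source: string) =
    let names = splitAggregated separator terms
    let accs = splitAggregated separator accessions
    let sources = splitAggregated separator source
    let length = Array.max [| names.Length; accs.Length; sources.Length |]
    let valueAt (arr: string []) i = if i < arr.Length then arr.[i] else ""
    List.init length (fun i ->
        { Name = valueAt names i
          TermAccessionNumber = valueAt accs i
          TermSourceREF = valueAt sources i })

let noAnnotations = fromAggregatedStrings ';' "" "" ""
let namesOnly = fromAggregatedStrings ';' "OA1;OA2" "" ""
```

Here noAnnotations is an empty list and namesOnly contains two records whose accession and source fields are empty strings.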
It is very easy to accidentally add whitespace to cells in Excel.
This whitespace is then parsed as an existing value. Add .Trim() to all value access functions to remove these issues.
Reproduce
(whitespace) to it." "
Expected behaviour
I created a FAKE logic to automatically write RELEASE_NOTES.md according to all git commits.
This would be good to use, especially as release notes are part of the nuget package description.
I will add this after the PR modernizing library structure and before we release the new set of nuget packages.
Example Swate RELEASE_NOTES.md: https://github.com/nfdi4plants/Swate/blob/developer/RELEASE_NOTES.md