simonaoliver / metageta
Automatically exported from code.google.com/p/metageta
License: Other
************* License *************

See <install>/MetaGeta/license.txt

************* System *************

Runs on Windows XP or Vista (only tested on 32 bit OS).

It has also been tested and mostly runs on 64 bit Linux (Ubuntu 9.04). However, there are issues with segfaults and read errors on HDF and NetCDF files on Ubuntu <=9.10. This is a GDAL/HDF4/NetCDF incompatibility: the UbuntuGIS gdal packages are built against a HDF4 library (libhdf4g) which includes an implementation of the netcdf api that is incompatible with the NetCDF library. See http://trac.osgeo.org/gdal/wiki/HDF for more info. The only way around it (currently) is to build your own HDF4 library from source and then link to that when building gdal from source. This should not be a problem on current versions of Debian or Ubuntu > 10.04 (lucid lynx) where gdal is/will be built against the new libhdf4-alt library.

************* Installation *************

*Windows*:
Download [http://code.google.com/p/metageta/downloads/list] the setup zipfile, then extract and run the installer. The application contains all the 3rd party libraries and applications required. The MrSID, JPEG2000 and ECW drivers use proprietary SDKs. To add support for these formats, download the plugins setup zipfile, then extract and run the installer.

*Linux (or Windows where you wish to use your own Python and GDAL/OGR installations)*:
Download [http://code.google.com/p/metageta/downloads/list] the source package or checkout [http://code.google.com/p/metageta/source/checkout] a copy of the source code, extract somewhere and see below.

You will need the following libraries:
* Python 2.7 (if you wish to use Python 3+, you will need to port the application) and the following non-standard Python libraries:
  * epydoc
  * lxml
  * openpyxl
  * pywin32 (obviously only on Windows)
  * GDAL python bindings (see below)
* GDAL 1.6+ http://www.gdal.org
  * On Windows, use Christoph Gohlke's wheels [http://www.lfd.uci.edu/~gohlke/pythonlibs], OSGeo4W [http://trac.osgeo.org/osgeo4w] or the prebuilt binaries at http://vbkto.dyndns.org/sdk to get GDAL 1.6+ and the appropriate python bindings. (Don't use FWTools, it only supports python 2.3.)
  * On Ubuntu, try the UbuntuGIS [https://wiki.ubuntu.com/UbuntuGIS] packages (or build from source if you prefer).
  * If you want to read ECW/JPEG2000/MrSID files, you'll need to install and link to the appropriate SDK (if building from source), or use the OSGeo4W plugins gdal-mrsid [http://trac.osgeo.org/osgeo4w/wiki/pkg-gdal-mrsid] & gdal-ecw [http://trac.osgeo.org/osgeo4w/wiki/pkg-gdal-ecw], or see the UbuntuGIS scripts gdal-mrsid-build and gdal-ecw-build (located in /usr/bin by default if installed from the repos). See also http://lists.osgeo.org/pipermail/ubuntu/2009-June/000054.html and http://trac.osgeo.org/ubuntugis/wiki/TutorialMrSid
  * You will also have to download and build appropriate libraries for other non-standard GDAL formats.
  * See http://trac.osgeo.org/gdal/wiki/BuildHints for more info

************* Usage *************

To use MetaGeta as provided, the first step is to crawl the filesystem to locate imagery and extract metadata to a spreadsheet. Writing to a spreadsheet allows for quality control, such as removal of records, checking for duplicates, bulk updates, etc. It also allows the addition of extra metadata fields such as contacts, use and access constraints etc. Information on additional metadata fields can be found here.

If you don't like spreadsheets, it's quite simple to roll your own script that writes straight to XML, database, etc... Check out the API documentation for more info.

You can run the crawler and transform applications by simply double clicking the runcrawler/runtransform.[bat|sh] files. Don't try to run any .py files directly unless you have set up your environment to suit.

You can also run the crawler and transform applications without the directory/files entry GUI popping up by passing arguments to runcrawler/runtransform.[bat|sh].

    Usage: runcrawler.bat/sh arguments

    Run the metadata crawler. If no arguments are passed, a dialog box pops up.

    Options:
      -h, --help       show this help message and exit
      -d dir           The directory to crawl
      -m media         CD/DVD ID
      -x xls           Output metadata spreadsheet
      -u, --update     Update existing crawl results
      -o, --overviews  Generate overview images
      --debug          Turn debug output on
      --nogui          Don't show the GUI progress dialog
      --keep-alive     Keep this dialog box open

The information extracted can then be transformed into XML. Currently only the ANZLIC Profile (ISO 19139) metadata schema is implemented for XML transformation, however more stylesheets can be added easily.

    Usage: runtransform.bat/sh arguments

    Transform metadata to XML. If no arguments are passed, a dialog box pops up.

    Options:
      -h, --help  show this help message and exit
      -d dir      The directory to output metadata XML to
      -x xls      Excel spreadsheet to read metadata from
      -t xsl      XSL transform {.xsl|"ANZLIC Profile"}
      -m          Create Metadata Exchange Format (MEF) file
      --debug     Turn debug output on

Basic API documentation may be found here: http://metageta.googlecode.com/svn/trunk/doc/index.html

GDAL segfaults when reading NITF images containing JPEG2000 compressed
subdatasets.
GDAL 1.6.3 (OSGeo4W) with gdal_ECW_JP2ECW.dll plugin.
This issue has only been reported on 32-bit Windows Vista. Can't reproduce
on XP.
Original issue reported on code.google.com by [email protected]
on 12 Feb 2010 at 5:55
Transformed xml with mediaid field does not validate in GN
Compliance to metadata standard (XML Schema) (1 errors)
cvc-type.3.1.2: Element 'gco:CharacterString' is a simple type, so it must have no element information item [children]. (Element: gco:CharacterString with parent element: gmd:abstract)
Original issue reported on code.google.com by [email protected]
on 18 Jan 2011 at 2:58
The function that checks whether an existing metadata record needs updating
will always return True if the quicklook is stored with a relative path in the
spreadsheet.
Original issue reported on code.google.com by [email protected]
on 9 Jul 2012 at 2:36
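One way the relative-path case in the report above could be handled is sketched below. It assumes, without confirmation from the report, that the relative quicklook path is being tested against the current working directory instead of the spreadsheet's directory. The function names `resolve_quicklook` and `quicklook_missing` are hypothetical and not part of MetaGETA.

```python
import os


def resolve_quicklook(quicklook, spreadsheet_path):
    """Return an absolute quicklook path.

    Relative paths are resolved against the directory that holds the
    spreadsheet (an assumption about where relative paths are rooted).
    """
    if os.path.isabs(quicklook):
        return quicklook
    base = os.path.dirname(os.path.abspath(spreadsheet_path))
    return os.path.normpath(os.path.join(base, quicklook))


def quicklook_missing(quicklook, spreadsheet_path):
    """True only if the resolved quicklook file does not exist."""
    return not os.path.exists(resolve_quicklook(quicklook, spreadsheet_path))
```

With this kind of check, a record would only be flagged for update when the quicklook is really absent, not merely stored with a relative path.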
Remove remaining .bat/.sh env vars (i.e CURDIR) so users with their own
python/gdal install and appropriate environment can just run the python code.
Original issue reported on code.google.com by [email protected]
on 9 Feb 2010 at 7:49
Build a proper pythonic setup.py
Original issue reported on code.google.com by [email protected]
on 19 Feb 2010 at 12:41
Download is missing Python26\Lib\encodings\mbcs.py
Original issue reported on code.google.com by [email protected]
on 16 May 2010 at 10:24
Add temporal extent element to ANZLIC XSL
Original issue reported on code.google.com by [email protected]
on 25 Nov 2010 at 11:00
Steps that will reproduce the problem?
1. Crawl a folder containing a partially written ECW. Typically, tmp files with
an ecw prefix are located in the write directory.
Once an incomplete file is encountered, the crawler suspends work.
Ideally, the crawler would identify the partially written file and skip to the
next file.
Original issue reported on code.google.com by [email protected]
on 11 Jun 2010 at 3:49
Incorrect histograms for 8 bit greyscale images as binsize is hardcoded to
be one but histogram min/max is not always 0,255.
Original issue reported on code.google.com by [email protected]
on 5 Feb 2010 at 2:18
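A minimal sketch of one possible fix for the histogram issue above: derive the number of bins from the band's actual min/max so that a bin size of one stays consistent with the range. The half-unit offsets follow the convention of GDAL's default histogram range (-0.5 to 255.5 with 256 buckets). The function name `histogram_params` is hypothetical, and the report does not say which approach was actually adopted.

```python
def histogram_params(band_min, band_max):
    """Return (lower, upper, bins) so that each bin is exactly 1 wide.

    Intended for integer data such as 8 bit greyscale bands.
    """
    lo = int(band_min)
    hi = int(band_max)
    bins = hi - lo + 1          # one bin per integer value in the range
    return lo - 0.5, hi + 0.5, bins


print(histogram_params(0, 255))   # (-0.5, 255.5, 256)
print(histogram_params(10, 200))  # (9.5, 200.5, 191)
```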
Add ability to extract metadata from rasters in zip and tar.gz/bz2 archives now
that this is handled transparently (for some formats...) in GDAL.
Original issue reported on code.google.com by [email protected]
on 25 Jan 2012 at 12:20
When moving crawl results around you have to update the file location for
quicklooks and thumbnails in order for the transform to bundle the image in the
mef. Maybe the file path is not even required for the bundling to work and the
quicklooks are written with the xls?
Original issue reported on code.google.com by [email protected]
on 23 Jun 2011 at 6:53
Problem: gui popup prevents command line execution of runtransform
Example execution: runtransform -x /g/data1/v10/metageta/rs0_FC/1986-08/FC25_V0.0_1986-08.xls -d /g/data1/v10/metageta/rs0_FC/mef --nogui
In addition, the xsl is missing (in the source version?) at
metageta/1.3.9/lib/python2.7/site-packages/metageta/transforms/
Original issue reported on code.google.com by [email protected]
on 5 Jun 2014 at 4:37
What steps will reproduce the problem?
1. Insert MEF to Geonetwork
2. Validate sample record in geonetwork
3. Note 4 errors due to date omission in data quality fields
What is the expected output? What do you see instead?
Field is null - use date of metadata transform if no date exists
Original issue reported on code.google.com by [email protected]
on 6 Dec 2010 at 5:08
Include medianame in the transform xsl to populate MediumName in TransferOptions
from the xls spreadsheet. It defaults to DVD but should be able to accept TAPE
and other media. It is currently hard coded.
Update documentation to include mediaid - http://metageta.googlecode.com/svn/trunk/doc/files/transforms-module.html
Original issue reported on code.google.com by [email protected]
on 18 Jan 2011 at 12:35
Histogram statistics for overview image stretch may be generated from
subsample image rather than original scene at full resolution to increase
throughput.
Create overview image without stretch (scale image values to 8bit to
retain dynamic range)
Calculate values for histogram on overview
Generate overview and thumbnail
Original issue reported on code.google.com by [email protected]
on 21 Feb 2010 at 10:30
Add script to extract metadata from single images.
Something simple like:
{{{
import sys,formats
indataset=sys.argv[1] #better arg parsing required
outfile=sys.argv[2]
#some sort of arg to specify xls or xml output
#some sort of arg to flag overview generation
ds = formats.Open(indataset)
md=ds.metadata
qlk=ds.getoverview(outfile+'.qlk.jpg', width=800)
md['quicklook']=qlk
#write metadata record to xml/xls...
}}}
Original issue reported on code.google.com by [email protected]
on 20 Jan 2010 at 4:02
ValueError: row index (65536) not an int in range(65536)
ExcelWriter.__addsheet__ is called but the row count is not reset.
Original issue reported on code.google.com by [email protected]
on 9 Mar 2010 at 1:34
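The rollover behaviour described in the report above can be illustrated with a small stand-in. This is a sketch only: `RollingWriter` is a hypothetical class that stores sheets as in-memory lists, not MetaGETA's ExcelWriter. The point is that the row counter is reset whenever a new sheet is added.

```python
MAX_XLS_ROWS = 65536  # row limit of the legacy .xls format


class RollingWriter(object):
    """Hypothetical writer: each sheet is a list of rows held in memory."""

    def __init__(self, max_rows=MAX_XLS_ROWS):
        self.max_rows = max_rows
        self.sheets = []
        self._add_sheet()

    def _add_sheet(self):
        self.sheets.append([])
        self.row = 0  # reset the row index for the new sheet

    def write_row(self, values):
        # Start a new sheet before the row index leaves the valid range.
        if self.row >= self.max_rows:
            self._add_sheet()
        self.sheets[-1].append(list(values))
        self.row += 1


writer = RollingWriter(max_rows=3)
for i in range(7):
    writer.write_row([i])
print([len(sheet) for sheet in writer.sheets])  # [3, 3, 1]
```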
No startup errors reported to gui - e.g. failure to set GDAL_DATA
Original issue reported on code.google.com by [email protected]
on 28 May 2010 at 12:44
When updating a crawl result, create overviews when requested if a dataset is
not modified and does not already have overview images.
Original issue reported on code.google.com by [email protected]
on 23 Nov 2010 at 11:09
None of the following work:
./runcrawler.sh -d '/mount/dir/some dir' ...
./runcrawler.sh -d "/mount/dir/some dir" ...
Original issue reported on code.google.com by [email protected]
on 22 Jan 2010 at 5:02
No filelist or filesize values for ESRI Grids
Original issue reported on code.google.com by [email protected]
on 25 Jan 2011 at 4:16
Assigning a NoData value breaks the assignment of colour table entries to
pixel values.
Original issue reported on code.google.com by [email protected]
on 21 Jun 2010 at 12:22
Enable checking of crawl results (reference existing .xls) during walk
process for new and modified datasets in target directory.
Results may be appended to the existing crawl result or created as a
new .xls - so that changes are maintained - depending on the user's
requirements, i.e. a checkbox 'append existing' or 'create new'
Original issue reported on code.google.com by [email protected]
on 21 Feb 2010 at 11:10
Current implementation is a complete kludge and hard to maintain.
NB: multiprocessing is only available in Python 2.6+
Original issue reported on code.google.com by [email protected]
on 8 Jul 2010 at 6:01
Standard pywin32 install puts a pywintypes25.dll in C:\Windows\System32. If
the build is different to that included in the MetaGETA package, a dll
conflict occurs.
To do:
Check if having a later version in C:\Windows\System32 than is included in
MetaGETA solves the problem. E.g. try installing build 214
http://sourceforge.net/projects/pywin32/files/pywin32/Build%20214/pywin32-214.win32-py2.5.exe/download
as normal and installing an earlier build to the MetaGETA python
site-packages dir.
Also, check what pywin32 build comes with ArcGIS.
Original issue reported on code.google.com by [email protected]
on 12 Feb 2010 at 1:10
Some NetCDF subdataset bands GetStatistics == [0.0, 0.0, -1.#IND, -1.#IND]
GDAL error is "Failed to compute statistics, no valid pixels found in
sampling".
Calling GetHistogram on such bands causes gdal to segfault.
Original issue reported on code.google.com by [email protected]
on 5 Mar 2010 at 4:57
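A possible guard for the case above, sketched under the assumption that the statistics arrive as the four-element list [min, max, mean, stddev] shown in the report (the `-1.#IND` values are how NaN is printed on Windows). The helper name `has_valid_stats` is hypothetical.

```python
import math


def has_valid_stats(stats):
    """Return False if any of [min, max, mean, stddev] is NaN or infinite."""
    return not any(math.isnan(v) or math.isinf(v) for v in stats)


print(has_valid_stats([0.0, 0.0, float('nan'), float('nan')]))  # False
print(has_valid_stats([0.0, 255.0, 12.5, 3.2]))                 # True
```

A caller could skip the histogram request for any band where this returns False, which would avoid reaching the call that segfaults.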
The following error is generated when the GetArgs dialog is called by MetaGETA
apps:
{{{_tkinter.TclError: unknown color name "{#c3c3c3}"}}}
This is a problem with the version of Tix that ships with Ubuntu:
https://bugs.launchpad.net/ubuntu/+source/tix/+bug/371720
The workaround is to upgrade your version of Tix to 8.4.3 from the following
PPA:
https://launchpad.net/~portis25/+archive/ppa
Original issue reported on code.google.com by [email protected]
on 16 Jul 2010 at 1:57
Refers to <=1.2 usage
Original issue reported on code.google.com by [email protected]
on 11 May 2010 at 4:59
The GUIDs generated may not be backwards compatible with those generated by
MetaGETA <= 1.2. This is because those GUIDs (which are meant to be
reproducible based on filepath) were based on paths that may not be
constant - such as non-UNC and mixed case filepaths in Windows.
This will be an issue if we want to update existing crawl results. Though
I have implemented a little hack in runcrawler to work around this.
For example, assume U:\ is mapped to \\server\share and V:\ is mapped to
\\server\share\subfolder. U:\subfolder\test.tif is the same file as
V:\test.tif but a different GUID would be generated. Also, GUIDs based on
V:\test.tif are different to those based on v:\test.tif (and other case
variations).
The GUIDs from now on should be constant on Windows as they are "normcased"
and converted to UNC (I don't know enough yet about mount points and links
on *nix to implement something similar).
There may still be an issue with datasets stored on moveable media, etc. I
think we need to look into getting disk IDs.
Original issue reported on code.google.com by [email protected]
on 26 Mar 2010 at 3:54
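The normalisation described above can be sketched as follows. Several things here are assumptions for illustration: the `drive_map` lookup table stands in for a real query of mapped drives, the helper names are hypothetical, and `uuid.uuid3` is used only as an example of a reproducible name-based GUID (the report does not say which scheme MetaGETA uses). `ntpath` is used so the Windows path rules apply on any platform.

```python
import ntpath
import uuid


def to_unc(path, drive_map):
    """Replace a mapped drive letter with its UNC root, if known."""
    drive, rest = ntpath.splitdrive(path)
    root = drive_map.get(drive.upper())
    return root + rest if root else path


def path_guid(path, drive_map):
    """Reproducible GUID from a UNC-converted, case-normalised path."""
    normalised = ntpath.normcase(ntpath.normpath(to_unc(path, drive_map)))
    return str(uuid.uuid3(uuid.NAMESPACE_URL, normalised))


# Hypothetical mappings matching the example in the report.
drive_map = {'U:': r'\\server\share', 'V:': r'\\server\share\subfolder'}
same = path_guid(r'U:\subfolder\test.tif', drive_map) == path_guid(r'v:\test.tif', drive_map)
print(same)  # True
```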
The "updated" check only tests last modified date of the image, not related
files.
This stops runcrawler.py from updating metadata for images that have had
modifications such as adding a projection definition in an external file
(aux, ers, prj, etc...).
Original issue reported on code.google.com by [email protected]
on 12 May 2010 at 3:08
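One way to extend the "updated" check to related files is to compare against the newest modification time among the image and its sidecar files. The sketch below assumes sidecars share the image's base name; the extension list is an illustrative subset taken from the report, and the function names are hypothetical.

```python
import os

SIDECAR_EXTENSIONS = ('.aux', '.ers', '.prj')  # illustrative subset only


def latest_mtime(image_path):
    """Newest modification time among the image and its existing sidecars.

    Assumes image_path itself exists.
    """
    base = os.path.splitext(image_path)[0]
    candidates = [image_path] + [base + ext for ext in SIDECAR_EXTENSIONS]
    return max(os.path.getmtime(p) for p in candidates if os.path.exists(p))


def needs_update(image_path, recorded_mtime):
    """True if the image or any sidecar changed after recorded_mtime."""
    return latest_mtime(image_path) > recorded_mtime
```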
Run transform and examine results. Search for thumbnail reference. In other
examples thumbnail and quicklook references appear below maintenance frequency
info. Thumbnail and quicklook are included in mef - just not referenced so
appear null in GN view.
Original issue reported on code.google.com by [email protected]
on 24 Jan 2011 at 2:33
What steps will reproduce the problem?
1. runcrawler.bat <no args>
2. Browse to data dir
3. Browse to existing xls
4. Delete .xls and copy rest of path
5. Paste rest of path as shp and log files
6. error occurs in "for ds in Crawler"
Original issue reported on code.google.com by [email protected]
on 12 Mar 2010 at 5:03
The 16 bit row limit of the xls format (65536 rows) may be limiting. openpyxl
supports xlsx, which allows a larger number of output records. For large
inventories this would be an advantage.
Original issue reported on code.google.com by [email protected]
on 5 Jun 2014 at 1:23
In runcrawler.py the -r option is set to True by default.
This prevents it from being used: if you omit -r on the command line it takes
the default value of True, and if you pass -r on the command line it is
again set to True.
I think line 326 should be patched from
opt=parser.add_option("-r", "--recurse", action="store_true",
dest="recurse",default=True,
to
opt=parser.add_option("-r", "--recurse", action="store_true",
dest="recurse",default=False,
Regards
Stefano
Original issue reported on code.google.com by [email protected]
on 30 Jul 2012 at 12:29
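The effect of the proposed patch can be checked in isolation with optparse. This is a standalone sketch, not the actual runcrawler.py code; `build_parser` is a hypothetical helper that registers only the `-r` option as patched.

```python
from optparse import OptionParser


def build_parser():
    """Parser with the -r option defaulting to False, as proposed."""
    parser = OptionParser()
    parser.add_option("-r", "--recurse", action="store_true",
                      dest="recurse", default=False)
    return parser


opts, _ = build_parser().parse_args([])
print(opts.recurse)  # False
opts, _ = build_parser().parse_args(["-r"])
print(opts.recurse)  # True
```

If recursion should stay on by default, an alternative would be to keep default=True and add a separate store_false option to switch it off.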
It seems like the gdal release included in the MetaGETA 1.3.6 setup was compiled
with MySQL support but the libraries were not included.
Steps to reproduce issue:
1) run metageta-shell.bat
2) at prompt run gdalinfo
3) System shows the error message attached as a screen capture (sorry, it is in
Italian as per my system settings, but it should be clear enough).
Original issue reported on code.google.com by [email protected]
on 30 Jul 2012 at 8:53
Attachments:
[deleted issue]
Don't close dialog box if "keep open" is checked when Ok is clicked on the
progress logger.
Original issue reported on code.google.com by [email protected]
on 20 May 2010 at 1:43
overviews.stretch('NONE', ...) truncates data > 8 bits instead of rescaling
Original issue reported on code.google.com by [email protected]
on 22 Feb 2010 at 4:19
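The difference between truncating and rescaling can be shown with a plain linear mapping onto the 8 bit range. This is a sketch on Python lists, not the overviews module's implementation; `rescale_to_byte` is a hypothetical name.

```python
def rescale_to_byte(values, src_min, src_max):
    """Linearly map values from [src_min, src_max] onto [0, 255].

    Values outside the source range are clamped, not wrapped or truncated.
    """
    if src_max == src_min:
        return [0 for _ in values]
    scale = 255.0 / (src_max - src_min)
    out = []
    for v in values:
        scaled = int(round((v - src_min) * scale))
        out.append(min(255, max(0, scaled)))
    return out


print(rescale_to_byte([0, 65535], 0, 65535))  # [0, 255]
```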
Add an option to export directly to XML/MEF, skipping the XLS spreadsheet.
Original issue reported on code.google.com by [email protected]
on 8 Feb 2010 at 5:21
Enable crawl of hard media in sequence without crawler shutdown
Ask user at the end of crawl if they wish to "crawl another?"
Provide text box for user to enter media id (default blank) - the values
entered here should persist following completion of the crawl for easy
update by the user. i.e. if user is cataloguing a sequence of CDs 101 to
200 etc.
For transform: -
Include MEDIA ID in supplemental information (or preferably in the
appropriate metadata field, i.e. Offline Media - offlineMed "Name of media
on which the resource can be found"). Include the MEDIA ID in the ABSTRACT
for searchability in Geonetwork.
Original issue reported on code.google.com by [email protected]
on 18 Mar 2010 at 4:09
Support vector data.
Original issue reported on code.google.com by [email protected]
on 8 Feb 2010 at 5:23
I'm not able to run the crawler in the 1.3.6 release; there are no issues with 1.3.5.
1. run metageta-shell.bat
2. from prompt run runcrawler.bat
3. the python interpreter complains about the import of the formats modules; see
the attached picture for details.
Regards
Stefano
Original issue reported on code.google.com by [email protected]
on 30 Jul 2012 at 9:02
Attachments:
Modify runtransform.py to use the new GetArgs class. Will require adding a
DropList type to the getargs module.
Original issue reported on code.google.com by [email protected]
on 31 Mar 2010 at 4:22
What steps will reproduce the problem?
1. Run crawler in update mode on an existing xls which contains records for
datasets that have been deleted.
Original issue reported on code.google.com by [email protected]
on 23 Nov 2010 at 4:37
*What steps will reproduce the problem?*
1. Run crawler and generate a new XLS
2. Run crawler over a folder that contains a tif with a non-ascii character
in its metadata, updating the above XLS
3. Profit???
*What is the expected output? What do you see instead?*
UnicodeDecodeError: 'ascii' codec can't decode byte 0x90 in position 87:
ordinal not in range(128)
Original issue reported on code.google.com by [email protected]
on 29 Apr 2010 at 6:22
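A defensive decode with fallback encodings is one common way to avoid this class of error. The sketch below is written for Python 3, whereas MetaGETA targets Python 2.7, where the str/unicode split differs. The helper name `to_text` and the choice of fallback encodings are assumptions.

```python
def to_text(value, encodings=("utf-8", "latin-1")):
    """Decode bytes to text, trying each encoding in turn.

    Text values are returned unchanged. If every encoding fails, undecodable
    bytes are replaced instead of raising UnicodeDecodeError.
    """
    if not isinstance(value, bytes):
        return value
    for encoding in encodings:
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            continue
    return value.decode("ascii", "replace")


print(repr(to_text(b"abc")))   # 'abc'
print(repr(to_text(b"\x90")))  # '\x90'
```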
Add support for Pleiades HR DIMAP V2.0
This will be added once GDAL supports it -
https://trac.osgeo.org/gdal/ticket/4826
Original issue reported on code.google.com by [email protected]
on 26 Sep 2012 at 5:38
To reproduce:
* Lock an existing shapefile (have it open in ArcGIS or another MetaGETA
process)
* Call runcrawler with the --gui arg and -s = above shapefile.
* runcrawler process crashes, progresslogger process keeps waiting.
Original issue reported on code.google.com by [email protected]
on 19 Feb 2010 at 1:05
Running crawler/transform repeatedly from the commandline results in a "The
input line is too long." error
Original issue reported on code.google.com by [email protected]
on 19 Feb 2010 at 3:12
Overview generation from byte ESRI binary grids fails
Original issue reported on code.google.com by [email protected]
on 22 Jan 2010 at 12:28
NoData values are not always set on self._gdaldataset bands. This skews
stats and histograms causing poor overview image quality.
Original issue reported on code.google.com by [email protected]
on 5 Mar 2010 at 3:56
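The skew described above can be seen by computing statistics with and without the NoData value. This is a pure-Python sketch on a list of pixel values; `stats_without_nodata` is a hypothetical helper and -9999 is an example NoData value.

```python
def stats_without_nodata(values, nodata):
    """Return (min, max, mean) over values that are not NoData, or None."""
    valid = [v for v in values if v != nodata]
    if not valid:
        return None
    mean = sum(valid) / float(len(valid))
    return min(valid), max(valid), mean


pixels = [-9999, 10, 20, 30, -9999]
print(stats_without_nodata(pixels, -9999))  # (10, 30, 20.0)
print(min(pixels))                          # -9999 when NoData is not excluded
```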