Difference between revisions of "2017 Winter Project Week/IPFS NoSQL Combination"


Latest revision as of 14:56, 13 January 2017


Key Investigators

  • Hans Meine (University of Bremen, Fraunhofer MEVIS = FME)
  • Steve Pieper (Isomics)
  • Satra Ghosh

Project Description

Objective | Approach and Plan | Progress and Next Steps
  • Evaluate IPFS / NoSQL combination for MIC databases
  • Evaluate IPFS' PSK feature for "private clouds"
  • Build prototype for scanning images
    • put images / files into IPFS
    • put metadata into a NoSQL database (Elasticsearch is what we used at FME; CouchDB is what Steve used in Chronicle)
  • Build prototype for browsing / showing images
    • should update live when images appear in the DB
    • should fetch image data from IPFS
  • IPFS stability and status
  • Performed several experiments; performance was disappointing (possibly related to the MIT Wi-Fi), but the transfers eventually succeeded
  • QmPyXW927iBPHVk3hfyzXAPGDpup26WGEh4LYK6da2xMhA is a TCGA subdirectory transferred to several project week participants' computers
  • Discussed an interesting deduplication idea: can original DICOM, anonymized DICOM, and e.g. NIfTI files share data blocks?
    • Answer: Yes. ipfs add takes a --chunker argument that defines the chunking algorithm (https://github.com/ipfs/faq/issues/214). The rabin chunker should apparently already perform well without any particular knowledge of the file formats, but it does not seem to be the default algorithm. Valid arguments include 'rabin', 'rabin-[avg]', and 'rabin-[min]-[avg]-[max]', with integer parameters.
    • Also keep an eye on the development of IPLD (for example, the format definition in the IPLD specs: https://github.com/ipld/specs/tree/master/ipld#format-definition)
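
The deduplication answer above can be illustrated with a small sketch of why content-defined chunking lets two files with different headers share blocks. This is not IPFS's rabin implementation: it uses a toy polynomial rolling hash, and every name in it (fixed_chunks, content_defined_chunks, shared_blocks, the constants, and the two fake "files") is hypothetical. The random payload with two different-length headers is only a stand-in for an original and an anonymized image file.

```python
import hashlib
import random

WINDOW = 16             # number of trailing bytes the rolling hash covers
BASE = 257              # polynomial base of the toy rolling hash (assumption)
MOD = 1_000_000_007     # prime modulus keeping the hash small
BASE_POW = pow(BASE, WINDOW, MOD)  # weight of the byte leaving the window
AVG = 256               # cut when hash % AVG == 0: roughly AVG-byte chunks on random data


def fixed_chunks(data: bytes, size: int = AVG) -> list[bytes]:
    """Split at fixed offsets, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data: bytes) -> list[bytes]:
    """Split wherever the hash of the last WINDOW bytes hits a fixed pattern."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        leaving = data[i - WINDOW] if i >= WINDOW else 0
        rolling = (rolling * BASE + byte - leaving * BASE_POW) % MOD
        if rolling % AVG == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last cut
    return chunks


def shared_blocks(chunks_a: list[bytes], chunks_b: list[bytes]) -> int:
    """Count distinct blocks (by SHA-256) that occur in both chunk lists."""
    digests_a = {hashlib.sha256(c).digest() for c in chunks_a}
    digests_b = {hashlib.sha256(c).digest() for c in chunks_b}
    return len(digests_a & digests_b)


rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(20_000))  # stand-in for pixel data
file_a = b"HEADER-A" + payload                 # hypothetical "original" file
file_b = b"A-MUCH-LONGER-HEADER-B" + payload   # hypothetical "anonymized" file

fixed_shared = shared_blocks(fixed_chunks(file_a), fixed_chunks(file_b))
cdc_chunks_a = content_defined_chunks(file_a)
cdc_shared = shared_blocks(cdc_chunks_a, content_defined_chunks(file_b))
print("fixed-size chunks shared:     ", fixed_shared)
print("content-defined chunks shared:", cdc_shared, "of", len(cdc_chunks_a))
```

Because the cut points depend only on the last WINDOW bytes, the boundaries realign shortly after the differing headers, so most payload blocks hash identically in both files; with fixed offsets, the shifted payload produces different blocks throughout. With IPFS itself, the corresponding step would presumably be an invocation along the lines of ipfs add --chunker=rabin <file>, using one of the argument forms listed above.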

Background and References