Interplanetary Adventures with IPID

Speaker 1 00:00:06 Welcome to The Rubric. I'm your host, Joe Andrieu.

Speaker 2 00:00:10 I'm Erica.

Speaker 3 00:00:12 And I'm Eric Schuh. Today on The Rubric, we talk with physician-developer Jonnycrunch about how IPID fits into the ecosystem and how it came to be.

Speaker 4 00:00:23 Which systems do you trust to communicate with, to get to the ground source of truth?

Speaker 1 00:00:30 IPID is the DID method based on IPFS, the InterPlanetary File System, the leading decentralized file storage system. Using IPFS as its verifiable data registry, IPID doesn't rely on a blockchain. Instead, it provides a simple and convenient way to use IPFS to resolve an IPID DID to its current DID document.

Speaker 2 00:00:56 In the real world, Jonnycrunch is a triple board-certified physician with training in internal medicine, medical genetics, and biomedical informatics, with previous academic appointments that include Stanford University and Vanderbilt University. He's also a serial entrepreneur and seasoned executive, having held roles as chief medical officer, chief information officer, and chief medical informatics officer at various startup companies. He is also a proficient programmer and is the author of the IPID (InterPlanetary Identifiers) DID method. He has been involved with IPFS, the InterPlanetary File System, for the past five years and is a regular contributor. Jonny, welcome to the show.

Speaker 4 00:01:42 Thanks for having me.

Speaker 1 00:01:44 Let's start off with the big, easy question, Jonny: what is did:ipid?

Speaker 4 00:01:50 So it's based on IPFS, and I think we'll probably have to dive into a little bit more of the detail of what IPFS is, and IPLD, InterPlanetary Linked Data, and why all of the "interplanetary." The basis of "interplanetary" really has to do with the fact that you address stuff based on its content, not its location. So in essence, if it works fine here, it'll work fine on Mars.
So it really is about the standalone nature of the content and the merits of that content. It's not about who is sharing it or where it is; it's really about what it is. So IPID is really just a pun on IPFS. Obviously it's built upon IPFS, but it's a really elegantly simple solution for a DID method that, of course, leverages the IPFS infrastructure. It really is as simple as publishing your DID document to IPNS, the InterPlanetary Naming System, which is part of the IPFS infrastructure.

Speaker 3 00:03:00 So IPFS, the InterPlanetary File System, is a file system, as you just said, that indexes based on the content of a document, which means that every update to a document would imply a new identifier.

Speaker 4 00:03:17 Yeah, exactly. It's hash-based content addressing, so if the content changes, the hash will change. IPFS is a decentralized system for storing and accessing files, websites, applications, and data. There's no central point of failure, and therefore it's also censorship resistant.

Speaker 3 00:03:39 Okay. And then there are two other parts, one of which you just touched on, IPNS, the InterPlanetary Namespace, and IPLD, which we haven't referenced yet, but that's InterPlanetary Linked Data. Is that right? Correct, yeah. Would you mind just digging into these three, IPFS, IPNS, and IPLD, a little bit more, and their relation?

Speaker 4 00:04:03 Yeah, probably start with IPNS, the namespace. The namespace is part of the infrastructure. At the lower level of IPFS is something called libp2p, the peer-to-peer networking layer, which is a sort of layer three or four for peer-to-peer broadcasting. The namespace really is about the public key. You're publishing content to a key-value store in a DHT, a distributed hash table, and that is the record. The record system in IPFS is those key-value pairs.
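
The hash-based content addressing described above can be illustrated with a minimal Python sketch. It is not how IPFS is implemented: it assumes a plain SHA-256 hex digest as a stand-in for a real CID, and the names content_address, put, get, and store are hypothetical helpers invented for this illustration.

```python
import hashlib

# Toy content-addressed store: address -> bytes.
# Assumption: a hex SHA-256 digest stands in for a real IPFS CID.
store = {}

def content_address(data: bytes) -> str:
    """Derive the address from the content itself, not from its location."""
    return hashlib.sha256(data).hexdigest()

def put(data: bytes) -> str:
    """Store the bytes under their own hash and return that address."""
    addr = content_address(data)
    store[addr] = data
    return addr

def get(addr: str) -> bytes:
    """Fetch bytes and re-hash them, so it does not matter who served them."""
    data = store[addr]
    if content_address(data) != addr:
        raise ValueError("content does not match its address")
    return data

v1 = put(b'{"name": "Alice"}')
v2 = put(b'{"name": "Alice B."}')  # any change to the content gives a new address
print(v1 != v2)
```

The point of the sketch is only that the address is a function of the bytes, so the same content resolves to the same address on any node, and an updated document necessarily gets a new one.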
The key in this case is the public key, or the hash of the public key, and the value is what you're putting there. In this case that would be the document, or rather its hash, the CID. Those hashes are self-describing: they use something called multihash, multibase, and multicodec.

Speaker 4 00:05:02 So it's self-describing. It's prepended by bytes from a lookup table that say what follows is a SHA-256 hash, that it's encoded in base58, which is the Bitcoin default base encoding representation, and that the content is raw data, binary, for instance. It's also versioned: this is version one of a codec that describes it. So if you get that hash and you know that it is a CID, then it describes, batteries included, what you need to do to decode what this information represents.

Speaker 1 00:05:47 So how does the link work with IPNS? When you look up, I take it it's just called the name, in IPNS, do you then get back a content address that you could then resolve over IPFS? Is that sort of the flow?

Speaker 4 00:06:02 Yeah, so that's in the record system. Part of it is actually in the DHT routing, which is basically buckets of storage of key-value pairs. In this case, the record includes the signature, because obviously you only want to trust the issuer of the key that this is associated with. So it's published to the hash of the public key, and the record includes a signature that validates that this record is signed by the holder of that private key, and that can be easily verified. That is the naming system: a key-value pair. The key is the hash of the public key, and the record includes a signature associated with that record.
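
The self-describing identifier described in this exchange can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the IPFS implementation: it builds a version-1 CID for raw binary content with a SHA-256 multihash, and it uses the base32 multibase (prefix "b") rather than the base58 encoding mentioned above, only because base32 is available in the Python standard library. The function names cid_v1_raw and describe are hypothetical.

```python
import base64
import hashlib

# Single-byte codes from the multiformats tables (all below 0x80, so each
# fits in one varint byte).
CID_VERSION = 0x01   # CID version 1
RAW_CODEC = 0x55     # multicodec: raw binary content
SHA2_256 = 0x12      # multihash: SHA-256
DIGEST_LEN = 0x20    # digest length: 32 bytes

def cid_v1_raw(data: bytes) -> str:
    """Build <multibase><version><multicodec><multihash> for raw bytes."""
    digest = hashlib.sha256(data).digest()
    multihash = bytes([SHA2_256, DIGEST_LEN]) + digest
    cid_bytes = bytes([CID_VERSION, RAW_CODEC]) + multihash
    b32 = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + b32  # "b" is the multibase prefix for lowercase base32

def describe(cid: str) -> dict:
    """Decode the prefix bytes to see what the identifier says about itself."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles base32 ('b') CIDs")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding stripped above
    raw = base64.b32decode(body)
    return {
        "version": raw[0],
        "codec": raw[1],
        "hash_function": raw[2],
        "digest_length": raw[3],
        "digest": raw[4:].hex(),
    }

cid = cid_v1_raw(b"hello")
print(cid)
print(describe(cid))
```

Because the version, codec, hash function, and digest length travel with the digest, a reader who knows only that the string is a CID can work out how to decode and verify it.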
And then the content, the value that you actually publish there, is the value.

Speaker 1 00:06:58 And how does Filecoin work with all of this stuff?

Speaker 4 00:07:02 So Filecoin is the incentive layer on top of this, and you don't necessarily need to have Filecoin. If you were to publish this, you need to have a place to store this and resolve this. That gets into the broader system of IPFS, the InterPlanetary File System, and this pinning nature of the content. The hash of the content, the CID, exists, but it needs to be pinned. You need to support that. You need to basically replicate it, pin it across different nodes in the network, or have someone replicate it on your behalf. And that's where Filecoin comes into play: it's the incentive layer, so you don't have to do it yourself. You can basically pay someone else to do it for you in a way that's distributed or decentralized.

Speaker 1 00:07:47 I see. So if I can host it myself, then I don't need Filecoin. But if I would like to leverage Filecoin as a mechanism, then I don't need to host it myself. Filecoin gives me an easy way to pay for hosting, to get the storage of the underlying file secured for some period of time.

Speaker 4 00:08:05 Exactly. And there's an entire ecosystem and marketplace for trades of that nature, a storage network, but IPID does not necessarily rely on that. It is a mechanism to incentivize, a decentralized way of encouraging people through a marketplace and payment for that. DID documents are actually relatively small, at least right now, about one KB on average. I'm sure they can grow as we get larger and larger public keys, or as you maintain a list of all your keys, but one KB would be trivial to store if you were to pay for it with Filecoin.
But it's probably important to note that this is actually not part of the IPID system. You could do this completely on your own, or there are pinning services out there, including Pinata, for instance. I think it's pinata.io, who will pin up to one gigabyte of files for you for free.

Speaker 3 00:09:10 All right, could you walk us through the standard life cycle of a did:ipid? First, how you would create it and the associated document, and then how it gets published to IPFS and subsequently updated?

Speaker 4 00:09:24 Yeah, so the content of the document is up to the DID author, and this is something that's slightly different from the other DID methods. You would have to have software to generate the document, to create the right structure with the required fields. That would be the ID section, obviously the public key section, and the other relevant pieces. So you would actually create that on your own. And then once you have that as, let's say, a JSON file, a JavaScript Object Notation representation of that DID document in IPID, it's as simple as, let's say, you cat that or echo that on the command line, if you know how to do that. Basically you're piping it from the terminal into IPFS, and there are two different ways.

Speaker 4 00:10:17 The preferred way is to cat the JSON file and pipe it to ipfs dag put. That puts the JSON file into the right format: it serializes it, produces a canonical representation of the JSON file, and then creates the CID. The output of that would be the CID. Then you take that CID, that content identifier, and you publish it to your IPNS namespace. So you do ipfs name publish with that CID. And that, in a nutshell, is creating an IPID DID document and publishing it as an IPID DID.
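
The first step of that life cycle, generating the document and serializing it deterministically, can be sketched in Python. Everything here is illustrative: the field layout is a minimal DID document with an id and a publicKey section as described above, the key type and the placeholder values are assumptions, and sorted-key JSON is only a stand-in for the canonical serialization that ipfs dag put performs. The helper names build_did_document and canonical_json are hypothetical.

```python
import json

def build_did_document(ipns_key_hash: str, public_key_b58: str) -> dict:
    """Assemble a minimal DID document. Field values are placeholders."""
    did = f"did:ipid:{ipns_key_hash}"
    return {
        "@context": "https://www.w3.org/ns/did/v1",
        "id": did,
        "publicKey": [
            {
                "id": f"{did}#key-1",
                "type": "Ed25519VerificationKey2018",  # assumed key type
                "controller": did,
                "publicKeyBase58": public_key_b58,
            }
        ],
    }

def canonical_json(doc: dict) -> str:
    """Deterministic serialization: same document, same bytes, same hash.
    Assumption: sorted keys and fixed separators stand in for the canonical
    form that the IPFS tooling produces."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":"))

doc = build_did_document("QmExampleKeyHash", "ExamplePublicKeyBase58")
print(canonical_json(doc))

# The remaining steps happen on the command line, as described above:
#   cat did.json | ipfs dag put     (prints the CID of the stored document)
#   ipfs name publish <that CID>    (points your IPNS name at the document)
```

The deterministic ordering matters because the CID is a hash of the serialized bytes: two serializations of the same document that differ only in key order would otherwise produce two different identifiers.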
Speaker 3 00:11:07 So you generate the document yourself as the first step, and then, using either echo or curl (though presumably there would be software libraries out there that you could use as well), you get the CID, the content identifier, which is then published to the namespace, the InterPlanetary Namespace. And that's essentially it.

Speaker 1 00:11:27 I got a little lost in there. So we create the document, I get that. We generate the CID, I get that. But I'm getting lost between the publication to IPNS, which is secured by some public key, and then, I think, it's separately going to IPFS to actually be stored on the network. Am I following, or am I missing something?

Speaker 4 00:11:48 Yes. So you're publishing it to your IPNS, which is basically creating a record that includes your cryptographic signature, and that gets stored in your local repository that runs IPFS. Now, to propagate it into the network, you need to make it discoverable. Let's say we want to pin it across seven continents, or we're going to use Filecoin to incentivize someone on each continent to pin it and make it available. That basically publishes that record into the DHT, the distributed hash table, to make it discoverable. It will only be discovered if it's queried and someone goes to look for it through the Kademlia routing, and then a Bitswap exchange would occur to actually retrieve that document. The DHT is a key-value store: if you're dealing with Kademlia k-buckets, a certain number of peers hold a certain number of key-value pairs in memory, in a distributed fashion, as part of the distributed hash table.

Speaker 4 00:13:07 So the key in this case is the hash of the public key, and the value is that CID, which represents a pointer to the document.
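
The Kademlia lookup mentioned above rests on one idea: peer IDs and record keys live in the same number space, and closeness is measured by the XOR of two IDs. The Python sketch below shows only that distance metric, under the assumption that SHA-256 of a label stands in for a real peer ID; the peer names and the helpers node_id, xor_distance, bucket_index, and closest_peers are hypothetical, and no networking is involved.

```python
import hashlib
from typing import List

def node_id(name: str) -> int:
    """Assumption: SHA-256 of a label stands in for a real peer ID or key."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")

def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: bitwise XOR, read as an integer."""
    return a ^ b

def bucket_index(a: int, b: int) -> int:
    """Index of the highest differing bit, which selects the k-bucket.
    Returns -1 when the two IDs are identical."""
    return xor_distance(a, b).bit_length() - 1

def closest_peers(key: int, peers: List[str], k: int = 3) -> List[str]:
    """Return the k peers whose IDs are XOR-closest to the key."""
    return sorted(peers, key=lambda p: xor_distance(node_id(p), key))[:k]

peers = [f"peer-{i}" for i in range(10)]
record_key = node_id("hash-of-some-public-key")
print(closest_peers(record_key, peers))
```

Each hop in a real lookup asks the closest known peers for peers that are closer still, which is how a query converges on the nodes holding the record for a given key.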
Speaker 1 00:13:17 So I think I follow that, but then don't we still have to publish the actual document to IPFS?

Speaker 4 00:13:23 You make the document available on your computer system, on your network of computers, or, let's say, your systems that have it pinned. And it's discoverable when you go to retrieve it.

Speaker 1 00:13:36 And does that, making it available on my server via IPFS, happen before I publish to IPNS or after? Before?

Speaker 4 00:13:46 Right. So you have to create the document, and you save the document as, let's say, a JSON file. You put it, meaning you serialize it and canonicalize it, because it has to be in a specific order so that it's deterministic, and that saves it to your local IPFS node. Then you publish that CID, the output of that put, to your IPNS record system. And then if your node is online, it's discoverable. Depending on which nodes you're talking to, this could be a public network, or it could be a private network, or it could be a network that lives on Mars. So the magic juju of the distributed nature is this: which systems do you trust to communicate with, to get to the ground source of truth? But the distributed fashion of it is that it will work on Mars perfectly fine, as it does here today. That's the interplanetary system.

Speaker 3 00:14:53 So essentially you generate locally, and then once you're ready to publish, you can choose which IPNS to publish your record to. And anyone who can get to that namespace can look up where your document is and get routed to your actual document.

Speaker 4 00:15:10 Yeah.
So the key piece of this underlying infrastructure is the peer-to-peer nature of the communication. There are multiple different protocols for nodes talking to nodes, and each one is a peer. There's no such thing as client/server or master/slave in this sort of model. Everyone in this network is a peer, and you default to communicating with a certain number of peers. There's a list of about 20 of them that are bootstrap nodes, and those bootstrap nodes aren't actually IPFS nodes; they're libp2p nodes. As you're probably aware, the Ethereum 2.0 protocol is moving towards using libp2p as the underlying chat, or gossip, protocol. So you can communicate and bootstrap with a list of known peers, or publicly published peers, or the default peers that are in the system. But you can edit that list and communicate only with the ones that you trust, or a certain segment of the ones that you trust. Or it could be a private network of peers, or it could be done over carrier pigeon. To libp2p, the protocol of communication doesn't matter.

Speaker 4 00:16:33 There is obviously TCP/IP communication. There's TLS, I think 1.3 is implemented now. There is the Noise protocol. You can do WebSockets. You can do carrier pigeon, if you'd like, or you can do satellites, if you can find...

Speaker 3 00:16:53 Any of those.

Speaker 4 00:16:54 Yes. I know we actually did an experiment at IPFS Camp with satellite dish communication. And Bluetooth: you can actually communicate with IPFS over Bluetooth. So for the protocol of exchange, or discovery of the network, there are multiple different streams of protocols for one peer to communicate with another peer.
But the nature of it is that it's all modular, so it's however you want to exchange information, and also however long you want to wait for a response. Let's say it's on Mars, and you know it's going to take 30 minutes for a round trip to Mars, and you're willing to wait that 30 minutes. Then you set up a satellite dish, you have a protocol that speaks satellite, and you have a node that's sitting on Mars that you know is trusted. The only way you can trust the response you get back is that it's encrypted, and the traffic is being sent from the peer that has the right key, signing traffic from the node that is actually on Mars.

Speaker 1 00:18:09 I'd like to clarify how did:ipid is secured, given what you just said about arbitrary communication channels at the network layer. I want to present a hypothesis, or a theory. I think the way it's secured is that the public keys in IPNS are what secure the link between a particular identifier and a particular content identifier, which you then get via IPFS. But over IPFS it's all peer-to-peer, and there isn't a particular mechanism to secure access to any particular piece of content.

Speaker 4 00:18:45 So everything is done by public/private key cryptography. Going back to IPRS, the InterPlanetary Record System, that key-value pair has to be signed by the private key associated with the hash of the public key. That record is only valid if it has a valid signature. So that means that your identifier, your DID method-specific identifier, is the hash of that private key, I'm sorry, the hash of that public key, which signs that record. At least, that has to be one of the keys.
In order to do a fully validated round trip, your public key array has to include the key that is actually signing the record associated with the IPID. But there are other systems out there. The default right now has moved away from RSA to [inaudible], which is the default key type now in the IPFS infrastructure. Other key types, given the nature of Ethereum 2.0, are also supported in the libp2p infrastructure, and I think the Polkadot system, at least, is using the secp256k1 curve to do record signing, et cetera. So that should be on the horizon.

Speaker 3 00:20:17 It seems like the security of any did:ipid is only going to be as good as the trusted nodes. As an individual, I can trust a set of nodes within the IPNS space, which would seem to imply that, especially for bespoke networks, like a large corporation that wants to set up a sane IPFS system internally, it could provide high levels of security. But once you go public, you start to have to make a lot of the same choices as you do in, say, the DNS type of world.

Speaker 4 00:20:51 Yeah, although it all boils down to the validation of that cryptographic signature. What I think you're speaking to is what's called an eclipse attack. The eclipse attack is that you basically segregate a network, so that the individual who owns that public key is segregated from the rest of the network. And so you're fooling the validators. So there's Alice and Bob, and then you have the validator, let's say the TSA agent, who is trying to validate a signature. You're trying to segregate the validator from the holder and prevent them from updating a record. But you'd have to do that at a pretty large scale to disrupt a DHT.
So the way the Kademlia routing table works is something called k-buckets.

Speaker 4 00:21:44 It's using the edit distance, or the bitwise distance, to jump from one k-bucket to the next k-bucket, and you get closer and closer to the peer who actually holds that content, in this case the hash of the public key. So you'd need to know the topology of your network to have a way, in theory, to segregate that network. That's very difficult to do, and it's beyond my understanding as far as the cryptography or network engineering that would be required, other than that it's very, very hard.

Speaker 3 00:22:18 And the difficulty only increases as the network size increases, then.

Speaker 2 00:22:22 Yeah. Jonny, you're a triple board-certified physician. I'm interested in how, but maybe more importantly why, you got involved in programming.

Speaker 4 00:22:34 Yeah, so my master's program and my master's thesis were in biomedical informatics. My background mostly is in bioinformatics, marrying my experience as a geneticist with the need for informatics in doing next-gen sequencing, alignment, interpretation of reports, the Human Genome Project, et cetera. So I went back to school and got my master's in biomedical informatics, and the taxpayers, thank you very much, paid me to go back to school. I took a huge pay cut, going from being a doctor to taking classes with 18-year-olds in CS 101, learning Java. I had had some programming experience before, in medical school, mostly JavaScript and HTML. And then I learned Java, C++, then PHP, then Python. I love Python. I still use Python every day. And Go and Rust lately are my other two favorite languages.
Speaker 4 00:23:37 So, in both bioinformatics and medical informatics, like I mentioned, informatics is really about creating methods, not just storing information but deriving new insights from the information. Bioinformatics is really about writing new pipelines for sequence alignment and interpretation of the human genome, which is very computationally intensive. So the projects that I worked on for my master's thesis and when I was on faculty were in bioinformatics and its merger with family history. For a clinical geneticist, family history is really important for the phenotype information. So how do we represent your family tree, your pedigree, for instance? That was part of my master's thesis. After I left academia, I was interested in genetic sequencing, so a co-founder and I created a sequencing lab. We started off in cancer testing, deciding treatment selection for chemotherapy based on the tumor tissue profile. That got to be just too expensive.

Speaker 4 00:24:44 That was mostly from a startup standpoint of raising the capital to go through the whole FDA approval for our sequencing lab. So we decided to focus on genetic sequencing of food, in particular dog food. We still actually support sequencing of dog food for food fraud. As part of that, I wrote all the bioinformatics algorithms. We have a novel approach to sequencing that required a novel pipeline, so I wrote that from scratch using Python. But it got me interested in IPFS, only because I had to manage fifty-five thousand genomes in a database for the startup company, and it got me thinking there's got to be an easier way to store this information. So I started writing my own, which was built on an MD5 hash.
Speaker 4 00:25:40 If you're familiar with that. Only then did I realize that someone else had to have done this already, and so I discovered IPFS. This was about 2015. Ultimately, the problem we were trying to solve with the food fraud detection was really about proving that someone in the pipeline had done testing. So it got me interested in blockchain solutions to solve this. This was 2015, 2016: the Bitcoin blockchain, and Ethereum was just brand new at that time. So it really got me interested in presenting a document to the supply chain that proved that genetic testing had been done and validated on some sample of the payload. And so we built that in 2016, and that really started off my interest in the IPFS project. These were really smart people, and I started contributing, attending the weekly calls, and talking about use cases, and really took a deep dive into IPFS, and then into blockchains and Bitcoin cryptography after that.

Speaker 3 00:26:49 So out of all of that, obviously you started working on did:ipid. What was the use case that really drove you into needing this new identifier?

Speaker 4 00:27:03 So I was at the ONC, the Office of the National Coordinator, meeting for this. This is health IT; the government had a white paper call for use cases of blockchain in healthcare. At that conference, this was 2016 or '17, I guess, one of the people who presented was Drummond Reed, who obviously is in the Sovrin and Indy world, as well as [inaudible]. They talked about this idea of decentralized identifiers, and it just piqued my interest as an interesting field where verifiable credentials would meet the needs of what I had in the food fraud document delivery blockchain solution that we were building.
So I attended that meeting and was intrigued by decentralized identifiers.

Speaker 4 00:27:53 Then two or three months later, I participated in a hackathon. You mentioned I'm a programmer, and one of the things I love to do is attend hackathons, really to challenge myself to do a deep dive. So I attended a hackathon at the Distributed Health conference in Nashville, Tennessee. A weekend hackathon, if you're familiar, is like 24 hours: stay awake, code as much as possible, and present what you've worked on. I had hoped to work on a solution for verifiable vaccinations, because at the time Hurricane Irma, I think, had just gone through Florida, and the governor of the state of Tennessee had suspended the laws for the practice of medicine and would allow any doctor from any state to take care of any patient if they were a refugee from Florida during the hurricane.

Speaker 4 00:28:52 That was actually the weekend of this hackathon. And it got me thinking: one of the sort of "hello world"s of healthcare and healthcare IT is vaccinations. Back when I was on faculty and used to teach medical students the introduction to biomedical informatics course, creating a proof of vaccination, or basically just a table of your vaccinations, was the hello world that we used to teach the students. So that weekend was a hello world of, basically, let's do verifiable vaccinations as a verifiable credential. As part of that, I needed a DID method. So I downloaded and installed Project Indy, which is now part of the Hyperledger project, but I was frustrated because it didn't do what I needed it to do.
Speaker 4 00:29:39 I just needed a DID method, quick and simple, to do my verifiable vaccinations and show at the hackathon what I did over the weekend. At the time it was very simple: hello Alice, hello Bob. And behind it, it used Plenum for the Byzantine fault tolerant, the BFT, consensus algorithm. I looked at the code and realized that what they were trying to do was really about this DHT, this distributed hash table. I scratched my head and said, oh yeah, I've seen this before. This is basically IPRS, the InterPlanetary Record System. So instead of trying to understand the messy code, I basically just realized I could publish this to IPFS and call it a DID method. I did that over the weekend and demoed it. Now, obviously, verifiable credentials are kind of a big deal, and vaccines are a driving use case for verifiable credentials. A few weeks later, I flew out to Boston and presented the work at the Rebooting the Web of Trust meeting. I think it was number seven; that was in Boston, in 2017.

Speaker 1 00:30:59 Who was involved, Jonny? Was this mostly you as a lone ranger, or did you have some allies and confederates?

Speaker 4 00:31:04 Just me for that weekend. There was a whole bunch of projects that we could have done. I was busy that day and, honestly, didn't do the whole 24 hours; I'm getting old and not willing to stay up. So I hacked that together, and that was just me that weekend.

Speaker 3 00:31:21 Are there people you're working with now, continuing the IPID work? Or who is continuing it now and going into the future?

Speaker 4 00:31:29 Yeah, so it really stands on its own. It's not supposed to be controlled by any company or blockchain.
So it's, as I mentioned, blockchain agnostic. You can use blockchains with IPID if you choose to, if you want an added layer of security, but you don't have to. There are companies that have licensed some of the code that I created to support it. There are people who've created their own methods or their own software on top of IPID. I think I mentioned Andre Cruz, from the labs, the matter labs, I guess, in Portugal. Then there was Alberto, who attended one of the Rebooting the Web of Trust conferences with us. He's working on a distributed VR platform; they wanted a lightweight DID method. There are some people in Australia who've licensed it for distributed digital rights management software. And then there are other companies that are coming up with more of a wallet sort of solution centered around key management, because in the end, all of these DID methods really require wallets and key management software.

Speaker 1 00:32:49 So talk to me about the wallets. Given that you talked about curl and other sorts of Linux command-line ways to interact with IPFS, I'm guessing you're using command-line key generation tools at this stage. Is the next step getting this supported by more GUI-like wallets?

Speaker 4 00:33:08 Yeah, you can, and there is some web-based key management software. It gets onto a slippery slope, though, because key management for your Bitcoin wallet, or your Ethereum wallet, or now your identity gets into the tricky business of maintaining the security around that. I don't know about you, but I'm pretty paranoid about my Bitcoin and Ethereum, and now my Filecoin wallet, as far as how to keep that private.
So I think a lot of these DID methods are really going to be centered around that, keeping that wallet secure. There are methods in IPFS. Right now, most of that goes into a .ipfs file, but it's up to the user to secure that, to do some magic to switch around the keys.

Speaker 4 00:34:00 That way, basically, it's not in the clear in your .ipfs. Although I think the latest version, the 0.9 version of IPFS, actually has that encrypted by default now. I should probably mention that IPFS is still in beta. IPFS is still version 0.9 as of this week, I believe, so there are still a lot of lessons being learned, and it's still not quite ready for the first release. But the proper way, in my opinion, would be either a hardware wallet or a cold wallet, where you perform those signatures on an offline device. I've got to the point where I have a whole different laptop, one that's offline, that I use when I do any of my transactions. As Christopher Allen taught me, you keep cold, offline wallets, and you should be sufficiently paranoid.

Speaker 1 00:35:00 Yeah, I'm a big fan of that approach.

Speaker 4 00:35:01 Yeah. And so it gets into, well, what's your level of paranoia, or what's your level of security? That's up to you to decide: how well you want to encrypt and back up your keys, or perform an offline signature for your DID document. With IPID, we're not controlling your keys. We're not controlling creation of the document. I don't own IPID. I published it as, hey, here's a simple hack for publishing a DID document. Other companies and vendors can come up with that wallet solution for performing those actions for you.
But this is elegantly simple; this, I think, is an even simpler approach. Speaker 3 00:35:44 I think that this discussion leads well into something we talked about in our pre-production call, where you brought up the importance of understanding not only the specification for any of these DID methods, but also the implementation, and particularly the specific implementation that you want to use, which I think ties well into this notion of managed wallets, or hot wallets as they're often called. So I just wanted to give you the opportunity to speak a little bit to the importance of knowing your implementation of any of these cryptographic systems. Speaker 4 00:36:22 It's important. You want me to dive into more detail? As I mentioned, I think it is as secure as you want it to be. So it's your methods. For instance, I have a PIV card, or, you know, just basically PGP keys. And so how well do you keep your PGP keys safe? PGP, of course, is Pretty Good Privacy. It's been going back 20, 30 years. So the public key infrastructure that we use for PGP keys is not new. It's been around for a long time. So how do we keep those keys private and secure? I have various methods. I unfortunately actually have one that's under Johnny Crunch from 1997, and at the time I didn't really quite understand; I was just playing with it, and I never created an expiration date. Speaker 4 00:37:15 So of course that was actually published on the MIT server. And I think it might still be on an MIT server, but I never created an expiration date. Now I've gotten to be more studious about it: I rotate my keys for my PGP. But this decentralized public key infrastructure is no different.
This is basically the same idea that we've had for 20, 30 years. The only difference is that there's not a centralized server for sharing them. It is up to me, the author, to actually share the document and be as secure as I want to be about rotating my keys, saving my keys, backing up my keys, and encrypting my keys for my purposes. Some I actually have on a PIV card, which is, I think, a personal ID card. I don't remember what the I and the V stand for, but basically it's like a government chip card, like you'd get from Estonia. I have ones that are on hardware security modules. I have one that's in my phone, obviously ones that are baked into my laptop, and then there are the hardware wallets, for instance, that you can use as well. Speaker 3 00:38:26 Is there anything in particular you would recommend when someone is looking for a piece of wallet software to use? Is there anything that you look for in particular in wallet software, or do you mostly just have to do your own due diligence and dig into it yourself? Speaker 4 00:38:41 Yeah, I think ultimately that's up to the end user, to be as paranoid as you want to be, and also as convenient as you want it to be. I still struggle with it myself. With my wife, for instance, and my crypto: I don't have a good mechanism to share it with my wife right now, because I think she would save it in a contact list if she could. And so I don't think that there would be a good mechanism to save the private key associated with my wallets. So I have different mechanisms, including offline wallets, but it depends on what I'm storing. Is it my pseudo identity, Johnny Crunch? Is it my professional credentials for my medical license?
Speaker 3 00:39:31 You're trying to secure your life savings over here, or is it just a Speaker 4 00:39:36 So, and it's no different from my password. I have password management software, and I'd say I'm probably a more advanced user, and I have more advanced mechanisms to keep things backed up. But in the end it's just a secret. So how do you keep track of your secrets? I would recommend backing it up. I think, as some author said, you should tattoo it on your scrotum or someplace private. Speaker 3 00:40:08 You wouldn't forget where that happened. Speaker 4 00:40:14 Christopher Allen actually has his titanium plates that he's demoed before. There's writing them down and sharding them with Schnorr signatures and sharing it with X number of friends and having a mechanism to recreate them. There are all sorts of ways of actually storing the private key. The interesting thing is that they are quite small compared to other secrets, and so it's relatively easy to create a good method for backing them up and encrypting them and storing them. Speaker 1 00:40:50 Johnny, what were the complications you faced in trying to get IPID working? Were there technical challenges you overcame, or personnel, or political? Speaker 4 00:40:59 Oh yeah. So political, for sure. So I think what we started to talk about is that when you publish the document, you can write it in JSON, JavaScript Object Notation, but in order to canonicalize it, to make it deterministic, so you basically get the same hash no matter what the ordering of the keys is, IPLD transforms it into CBOR, the Concise Binary Object Representation, which is actually a superset of JSON. And so there's all sorts of controversy in the working group that we're involved in about supporting CBOR or not supporting CBOR.
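[Editor's sketch] The point above about getting the same hash regardless of key ordering can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions: it uses sorted-key JSON from the standard library as a stand-in for a canonical CBOR encoding, and the `did:ipid:example` identifier and field names are hypothetical, not taken from the IPID specification.

```python
import hashlib
import json


def naive_hash(doc: dict) -> str:
    # Hash the JSON text exactly as the dict happens to be ordered.
    return hashlib.sha256(json.dumps(doc).encode("utf-8")).hexdigest()


def canonical_hash(doc: dict) -> str:
    # Stand-in canonical form: sorted keys, no optional whitespace.
    # Assumption: this only imitates the guarantee a canonical binary
    # encoding would give; it is not CBOR and not what IPFS computes.
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two documents with the same content but different key order (hypothetical fields).
doc_a = {"id": "did:ipid:example", "created": "2021-01-01"}
doc_b = {"created": "2021-01-01", "id": "did:ipid:example"}

print(naive_hash(doc_a) == naive_hash(doc_b))          # False
print(canonical_hash(doc_a) == canonical_hash(doc_b))  # True
```

Without an agreed canonical form, two parties holding the same logical document can derive different content addresses, which is the problem a normalization step is meant to remove.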
There's a new approach called CBOR-LD, for CBOR Linked Data, which is just emerging as a specification. And so there are all sorts of nuances. People want to make it simple; they just want it to be JSON. JSON is limited because it's not a byte representation, it's a string representation. Speaker 4 00:41:49 So you get slightly different ordering depending, and to validate you need a normalization algorithm. So I've been advocating CBOR all along, which is a superset of JSON, to make it canonical too, but that has been an uphill battle. So in the current version of the IPID DID method, there is a reference to a slash. That slash is supposed to represent the tag 42, which represents a CID. So basically it's the batteries-included, self-describing model: the thing that comes after it is supposed to say that this is a CID. You can do that in CBOR, and it's hard to do in JSON. So part of my task now, since we're on the podcast, is to go back and actually update that document, which is now two and a half years old, to be more conformant with the now published DID specification. So certainly I think political challenges; I don't think this is technical. The subtleties of the method really are as simple as using IPFS. And you can write lightweight IPFS nodes that do only certain of those tasks without actually having to run a full node and share all that data. Mostly, I guess, it would be just the political, non-technical. Speaker 1 00:43:14 You have mentioned that the specification itself is a little bit out of date. Is there a timeline for the next version? Speaker 4 00:43:24 Sure. Let me give myself a week. Speaker 1 00:43:28 Great.
So you might have it done before we go to press with, Speaker 4 00:43:32 Yeah, I think the details of the documentation just talk about creating and updating and deleting it. But the subtleties are that, as it stands now, the IPID document can stand, and it is conformant. And the reason is that we all agreed on a resolution in the DID working group that it is a, what is it, a keep-by-default paradigm: if there's something that is not registered in the DID Spec Registries, or something that doesn't conform to the specification, you have to keep it in. So that means things like the example I give, which is previous, a key or a property in the document, or the slash that represents the tag 42 CID on that previous property. Speaker 4 00:44:31 And this allows you to describe that the following content is a CID. Those will actually be passed by default, meaning even though it's not fully conformant, any DID method or resolver can't remove those; those should be retained. So they're basically a sprinkling of additional information. So the previous property; we didn't dive into this in much detail, but IPLD, InterPlanetary Linked Data, is basically a DAG, a directed acyclic graph. Having the benefit of a previous field gives you a pointer to the previous version of the document. So as you make revisions, there's always a pointer to the previous one. That's step number one to add to the security. The second piece is something called an anchor field. As I mentioned, IPID does not use a blockchain by default, but you can use blockchains to anchor the proof of existence of your DID document at a given moment, at a specified time.
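[Editor's sketch] The idea of each revision carrying a pointer to the previous version can be sketched as a small hash-linked chain. Everything here is an assumption made for illustration: `fake_cid` is a SHA-256 hex digest standing in for a real CID, the in-memory `store` stands in for IPFS, and the document fields are hypothetical. Anchoring the newest identifier on a blockchain is not shown.

```python
import hashlib
import json
from typing import Dict, List, Optional


def fake_cid(doc: dict) -> str:
    # Hypothetical stand-in for a CID: SHA-256 over a sorted-key JSON form.
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# In-memory content-addressed store standing in for IPFS (assumption).
store: Dict[str, dict] = {}


def publish(doc: dict, previous: Optional[str] = None) -> str:
    # Copy the document and, for a revision, link it to the prior version.
    version = dict(doc)
    if previous is not None:
        version["previous"] = {"/": previous}
    cid = fake_cid(version)
    store[cid] = version
    return cid


def history(cid: str) -> List[str]:
    # Walk back through the "previous" links, newest first,
    # checking that each stored document still matches its identifier.
    chain: List[str] = []
    current: Optional[str] = cid
    while current is not None:
        stored = store[current]
        if fake_cid(stored) != current:
            raise ValueError("stored document does not match its identifier")
        chain.append(current)
        link = stored.get("previous")
        current = link["/"] if link else None
    return chain


v1 = publish({"id": "did:ipid:example", "keys": ["key-1"]})
v2 = publish({"id": "did:ipid:example", "keys": ["key-1", "key-2"]}, previous=v1)
v3 = publish({"id": "did:ipid:example", "keys": ["key-2"]}, previous=v2)

print(history(v3) == [v3, v2, v1])  # True
```

Because each identifier is derived from content that includes the link to the prior version, changing an old revision would change every identifier after it, which is what makes the chain auditable.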
Speaker 4 00:45:44 And the reason why that's valuable is that you want the ability to walk back in time and show that this DID document and the public key associated with, let's say, a verifiable credential existed at a certain point in time. And that's where blockchains are really handy: having a sequential ordering of events lets you go back, let's say, 10 years prior and verify, non-repudiably, that yes, this key existed. Because otherwise, if you're just appending each key to the array, and the final version of your DID document is basically the final version, and, let's say, you don't have a blockchain underpinning it that you can audit, then all you're doing is trusting the latest version of the document. So it really is what we call a Sidetree approach to blockchain technology, which is really a lightweight directed acyclic graph, very similar to how blockchains work, which is a pointer to the previous block, but in this case it's a pointer to the previous version of the document, and ideally with an anchor as well that offers proof of existence. Speaker 4 00:46:52 Now, I've done this experimenting with proof-of-existence smart contracts. In fact, the default Ganache, which is sort of the developer tools for Ethereum, has a default proof-of-existence smart contract that just publishes the bytes. So that is one example of doing it. The other one is the OpenTimestamps protocol from Peter Todd, one of the Bitcoin Core developers, which does a similar Sidetree approach of rolling up a number of Merkle DAGs and including that in, I believe, an OP_RETURN in the Bitcoin blockchain. Speaker 1 00:47:33 So the very next step is to update this specification to be conformant with the DID Core specification, which is on the cusp of being published by the W3C.
Is there something after that? Is there an IPID 2.0, or where do you continue in this work? If IPID is done and on a shelf and good to be used, what's next? Speaker 4 00:47:56 Yeah, I think, personally, the next step of this is the key management. How do you rotate your keys? How do you do, let's say, M-of-N signatures? So basically you want to have co-signers. And also, how does this play into the IPFS ecosystem, with libp2p and support for, let's say, the secp curves in the records? So I think it really is continuing to work and help out the IPFS community, to bolster the use of IPID in the ecosystem. I would love to see IPID as sort of the default DID method for all of the identity systems that we need in IPFS and Filecoin. Speaker 1 00:48:43 Very cool. Well, this conversation has definitely inspired me to consider using did:ipid for an application we're developing. I had already pretty much said, hey, you know, did:web's an easy starting point. But it doesn't feel like did:ipid is much more work. I mean, I have to figure out how to plug into IPFS and IPNS, but those are existing systems, and there are APIs and libraries for them. So it seems like it might be an easy lift for that application. Before we get to Speaker 3 00:49:14 our wrap-up, is there anything that you wanted to touch on that we might've missed? Speaker 4 00:49:20 Yeah, I think I would probably just put a plug in for PRs: pull requests are welcome on the DID method. I think you'll put a link in the show notes, but it is referenced in the specification, or rather the DID Spec Registries has a list, and the IPID entry points through to the IPID DID method.
And so if people actually want to take a look at it and give feedback or suggestions for improvements, PRs are welcome. Speaker 2 00:49:51 It's actually now time for our shameless plug segment of the Speaker 1 00:49:55 podcast. I have one, Erica, which I'm stealing; it's what you may have done if you had thought about it. My shameless plug is for the Dave Matthews Band. I guess it was just yesterday, Erica, they performed their first concert in 520 days. It was broadcast widely by fans. It was fun and it was beautiful. And we'll get through this COVID thing one way or another. And thanks, Dave, for getting back up on stage and sharing the joy of music. Speaker 2 00:50:27 Thanks, Dave. Yes, it was Friday night, their first concert, I think. There was so much excitement over the crowd being able to get together, but I did have that feeling like, oh, I don't know, it might be a little early for that kind of gathering. Was he at one of the only-vaccinated-allowed venues? I don't know. That's a great question. I hope so. I know a lot of places. Speaker 4 00:50:52 Yeah. Even though I wrote the verifiable credentials use case for verifiable vaccinations, I still have reservations about it. And that's as a doctor. I don't know if that should be a driving use case for your public credentials, because it stirs up a whole bunch of socioeconomic controversies that I feel uneasy about. And even though I wrote one, and I have one that works on my phone that I wrote, which of course uses IPID, there's the social dilemma that they actually have to have the technology.
And what I've learned in my experience working with IPFS, for instance, what really is one of my passions in being involved with the project, is the censorship resistance. Speaker 4 00:51:43 So Joe and I actually had a discussion about this while we were in Barcelona, I think, talking about how IPFS supported the uprising of the district of Catalonia, which is where Barcelona is located, and the referendum voting for independence. There were huge amounts of protests, et cetera, and what the IPFS community did was replicate the voting information to allow people to find out where they were going to vote, to encourage people to make their own decision, and to enable the democracy to occur. So IPFS by nature is a censorship-resistant protocol. And I think that speaks to this decentralized or distributed nature of self-sovereignty: I have control over my own identity, I have control over my own document and my public keys, and there's no central point of failure. Speaker 4 00:52:50 It's the same thing with Turkey. With the Turkish Wikipedia site, when there was an uprising in Turkey, they turned off Wikipedia. Can you imagine a country where they turn off Wikipedia because they don't want people to have access to it? So what we did in IPFS was back it up, and the entire Wikipedia, including the Turkish version of Wikipedia, is now on IPFS. So it speaks to the distributed nature of this. And the same thing is happening in China right now, where there's the Great Firewall of China. Well, when you have protocol-agnostic communication, via satellite or Starlink or you name it, censorship becomes a lot harder. And I think that, in my mind, really facilitates true self-sovereignty and really democracy in general. Speaker 1 00:53:41 Yeah.
I love that. One distinction I'd offer to the conversation about VCs for vaccine records or vaccination passports or whatever: I actually think it's a great use case. Not because the proposed solutions solve the social problem, but because it teaches us a lot when we understand how it fails. So VCs do some things right, but all these other problems aren't quite addressed. And so that, to me, makes it very useful for understanding better where we need to take this technology. Speaker 4 00:54:12 Yeah. And it's certainly highlighted for me the risks of correlation. Even in, let's say, other DID methods like did:peer, for instance, where you're basically creating one peer ID for each relationship, well, what's the risk of correlation in, let's say, restaurants or health care? I'm involved in healthcare use cases right now where they actually want to do correlation across all your identifiers. Well, that kind of gives me pause, because that's exactly what we're trying to avoid with the DID methods. So it just gives me pause as far as the slippery slope of how these will be used. And I guess that also speaks to the underlying decentralized nature of it. And it really reminds me of Tim Berners-Lee. At Rebooting Web of Trust, I actually had the honor of meeting Tim Berners-Lee. Speaker 4 00:55:01 And unfortunately I pissed him off when I first met him. The reason was that at the time, Joe, if you remember, Christopher Allen was asking about what the hidden signals are that we're not talking about with Rebooting Web of Trust, with the web of trust. And I sort of raised my hand and talked about porn, and you don't talk about porn in polite conversation. And so I was talking about basically how every technology has its moments, and in the porn industry it was Beta versus VHS.
Or you just think about how porn has been a major driver for video codecs and payment models, and drives 30% of all internet traffic. And it's not that I'm advocating that DIDs should go after the porn industry. I'm just saying these are social implications of our technology that we should be aware of. But I left it with Tim not wanting to talk to me. I went up to him afterwards and just tried to jokingly explain myself, but he was all flustered and actually wouldn't talk to me. He's one of my heroes. But it really is about the social implications of our technology. We need to understand those and be mindful of those before we just embrace them willy-nilly and run down the road. Speaker 1 00:56:22 Other than IPID, what is your favorite DID method? Speaker 4 00:56:28 I guess did:key. It's basically inline, a simple, elegant one-off key that actually uses multicodec, which is part of the IPLD pro... I'm sorry, the IPFS project. Speaker 1 00:56:49 Excellent. Well, hopefully by the time this gets published, we will have published our episode that covers did:key. So look for that. If you're listening to this, go check out our episode about did:key. Speaker 2 00:57:00 And that will bring us to the end of our show today. Speaker 1 00:57:03 Thank you, Johnny, for joining us today. Thanks also to our staff, our producer, Erica Connell, and our co-host Eric Schuh. I'm your host, Joe Andrieu. Speaker 2 00:57:12 Wherever you find The Rubric podcast, please take a moment to subscribe to our feed, so you'll be notified when our next episode is released. We look forward to you joining us next time. The information, opinions, and recommendations presented in this podcast are for general information only.
And any reliance on the information provided in this podcast is done at your own risk. The views, thoughts, and opinions expressed by the speakers in this podcast belong solely to the speakers and not necessarily to the speakers' employer, organization, committee, or other group or individual.
