I've been thinking about trust and relationships on the Internet, specifically social networking and news. Here is one possible approach to handling it.
In the last few days, I briefly got into a discussion about Twitter's poor management of death threats versus how they handle swearing at verified accounts or banning for specific posts. It was frustrating for folks because they felt that a tweet that said “I hope you die in a fire” somehow didn't violate community standards but a relatively mild hashtag did.
At the same time, there were a couple of flame wars on my feed (he said, she said) with blocking left and right as people picked sides. There were a couple of egg accounts (people starting up anonymous accounts to attack others) involved.
This was all against a backdrop of fake news sites that influenced the recent elections and the general attack against science.
The problem is that Twitter, Facebook, and most social networks aren't capable of policing their networks. Twitter has six thousand tweets per second and Facebook is close to a quarter million per second. Assuming that the conflict rate (the number of people who disagree or agree passionately with each other) is relatively constant, that is a lot of reported posts.
This also means that when a human is reviewing it, they are dealing with a deluge of data. It doesn't matter if it is a social network reviewing reported posts, Amazon reviewing books being published, or even a slush pile reader: going through so many posts of vile hatred, disgust, and terrible things will break anyone. That is one reason why content moderation is a difficult job that inflicts PTSD on even the most stalwart of individuals.
Also, because humans are so adaptable, it is difficult to automate flagging content in a consistent manner. That means that one person may not think “die in a fire” is offensive, while another (after spending an hour looking at car accidents) thinks it is nothing compared to death, while a third is offended and will mark it as such. Regardless of how the item is marked, there is a high probability of it being escalated to a “manager”, who is just another person inundated with too much information.
I've mulled over this idea a few times over the last year, but I was mowing yesterday and got to thinking about it. The general idea is that crowd-sourcing “community standards” might be the approach but, as I learned, “common sense” doesn't exist. You have individuals who are at opposite ends of the same idea (regardless of who is wrong) and they don't like when the “other side” is in charge of content (usually the words “fake news” or “censorship” show up when this happens).
Everyone thinks their beliefs are “common sense” and “community standard”. In reality, there really isn't such a thing. Seeing the hatred on the network points that out: there are simply too many people who think that sending death threats (I've gotten my share) is perfectly “acceptable.”
Web of Trust
I think this can be done by establishing a web of trust between individuals. There are a few components to the web that are needed for this to work.
The first is a relationship between an individual and a target (which is an individual's presence). This is an opinion between two people. So, if Alice blocks Gary, then that is represented by a negative relationship (Alice - Gary). If Alice vouches for Mary, then it would be a positive relationship (Alice + Mary). In this case Alice, Mary, and Gary could be social accounts, Twitter handles, or Facebook identifiers.
A relationship would determine someone is blocked by having a negative score.
The second is a connection. A connection is a relationship between two individuals where the first person trusts the second person's opinions. For example, if Steve feels that Alice fits his community standard, he “trusts” her (Steve < Alice). This can also build up chains, such as Mark < Steve < Alice. It is a web because an individual may have multiple connections, such as Mark < Steve, Mark < Jenny.
The direct relationships are pretty easy. Alice doesn't like Gary (Alice - Gary). The difficulty comes in passing that relationship down. This leads to the third concept, the threshold. A threshold is relatively simple, but there is both a positive threshold and a negative threshold. A positive threshold is the point where the sum of all connections' opinions about an individual becomes a positive rating. The negative threshold is the point where the sum is low enough to become a negative opinion.
This might need a few examples. Say Steve says “if anyone hates someone, then I hate them”; this is a negative threshold of -1. So when Alice - Gary is established, the total that Steve sees is -1 and he also creates a negative relationship with Gary (Steve ?- Gary). I'm using the “?” to say it comes from the trust instead of explicitly being set.
On the other hand, say Mark requires two trusted connections. When Steve established a negative opinion of Gary, then Mark saw a total of -1 for Gary. However, he requires -2, so he doesn't establish a negative (blocking) connection to Gary. Now, if Jenny also had a negative opinion (Jenny - Gary), then Mark would see -2 and would then create his own opinion of Gary (Mark ?- Gary).
We could also create a warning threshold, but the idea is pretty much the same. Likewise, we could also establish a reverse connection (whatever Alice likes, Gary hates, or Gary > Alice).
Say Jenny doesn't have an opinion of Gary (get rid of the Jenny - Gary) but Mark also trusts Alice's opinion (Mark < Alice, Mark < Steve, Mark < Jenny). When Alice established a negative relationship with Gary, it fed into Steve's (as above) and also Mark's. Mark then sees -2, one from Alice and one from Steve. This would then create that negative relationship.
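Here is a rough sketch of how those three examples could be computed. Everything in it is my own assumption for illustration: the Person class, the opinion function, and the choice to store thresholds as positive magnitudes (so a threshold of 2 means a sum of -2 is needed to block). Explicit opinions win, everything else is the sum of what the trusted connections think.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    # Explicit opinions: target -> +1 (vouch) or -1 (block).
    explicit: dict[str, int] = field(default_factory=dict)
    # Names of the people whose opinions this person trusts (Mark < Steve).
    trusts: list[str] = field(default_factory=list)
    # Thresholds stored as magnitudes (an assumption of this sketch).
    positive_threshold: int = 1
    negative_threshold: int = 1


def opinion(people, name, target, _seen=frozenset()):
    """Return +1, -1, or 0 for what `name` thinks of `target`."""
    if name in _seen:  # break cycles in the web of trust
        return 0
    person = people[name]
    if target in person.explicit:  # an explicit opinion trumps everything
        return person.explicit[target]
    seen = _seen | {name}
    total = sum(opinion(people, other, target, seen) for other in person.trusts)
    if total >= person.positive_threshold:
        return 1
    if total <= -person.negative_threshold:
        return -1
    return 0


# The example above: Alice - Gary, Steve < Alice, Mark < Steve, Mark < Jenny.
people = {
    "Alice": Person("Alice", explicit={"Gary": -1}),
    "Steve": Person("Steve", trusts=["Alice"]),
    "Jenny": Person("Jenny"),
    "Mark": Person("Mark", trusts=["Steve", "Jenny"], negative_threshold=2),
}
steve_view = opinion(people, "Steve", "Gary")  # inherited from Alice
mark_view = opinion(people, "Mark", "Gary")    # only Steve counts so far
```

Each connection contributes at most one point in either direction here, which is one possible reading of "the sum of all connections' opinions"; weighting connections differently would be an easy change.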
The advantage of this system is that someone can have a low negative threshold when they are having trouble with being overwhelmed or a high negative threshold which requires more people to have an opinion (if everyone at the table disagrees…). It could also be dynamic based on their current situation.
One of the biggest complexities with this is how to communicate these relationships. I don't trust a single site, organization, or company to manage these relationships. Sooner or later, they will have to figure out some way to pay the bills and that means using the data for advertising or income.
Multiple providers, on the other hand, help reduce that, since individual providers don't have to deal with as much network, processing, or requirements. It is easier to spread the costs among others.
I wasn't sure how to make this happen until recently. In the last month, Mastodon has gotten a bit of press. The idea behind it is a good one: a federated social network. It isn't an original idea, but they have a pretty solid implementation of the GNU Social network.
We could use the same concept for sharing and communicating. The federation and protocol would work out nicely using existing and tested protocols. It also means we can use relatively simple text messages to communicate the various relationships. At the same time, “following” an account would be the same as establishing a trust relationship.
Alice - Gary would actually be
The target of a relationship (the Alice - Gary) doesn't have to be an account. Instead, it is just an arbitrary identifier. There is already a pretty good standard for that. In effect, we can use the URL of an account as the target, with some standard rules for how to parse specific URLs (trailing slashes, capitalization).
Of course, these can be really long, which causes a problem. Fortunately, we can easily hash the result (MD5) to reduce it to a consistent length and then encode it into hex. We don't have to worry about duplicate hash entries, since the URIs we are dealing with usually don't have the “room” to create a duplicate hash.
When we do this, we get:
- 1dca72d547e37c33cbdbdeeecd3d6119 https://twitter.com/dmoonfire
- c877a19c5bdb62a2e3e946411fb75dd5 https://d.moonfire.us/
- 1a7e49dc5912887d23a727c3bd933750 https://ello.co/dmoonfire/
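A minimal sketch of that step, assuming one possible set of normalization rules (lowercase the scheme and host, drop a trailing slash and any fragment). The function name target_id and the rules themselves are my assumptions, so the digests it produces will not necessarily match the ones listed above.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def target_id(url: str) -> str:
    """Normalize a URL and return its MD5 digest as 32 hex characters."""
    parts = urlsplit(url.strip())
    # Assumed rules: case-insensitive scheme/host, no trailing slash, no fragment.
    normalized = urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.rstrip("/"),
        parts.query,
        "",
    ))
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Both spellings collapse to the same identifier under these rules.
same = target_id("https://Ello.co/dmoonfire/") == target_id("https://ello.co/dmoonfire")
```

Whatever rules get picked, every instance in the federation has to apply exactly the same ones or the hashes won't line up.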
Now there are cases where two identifiers/targets are the same one. This can be handled the same way, with a trusted network identifying an equality or inequality.
- My two URLs are the same:
1dca72d547e37c33cbdbdeeecd3d6119 = c877a19c5bdb62a2e3e946411fb75dd5
- My two URLs aren't the same:
c877a19c5bdb62a2e3e946411fb75dd5 != 1a7e49dc5912887d23a727c3bd933750
We could set an equality threshold and an inequality threshold and then pass those through the network like the relationships above.
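One way this could work, as a sketch only: tally the EQ and NE votes coming in from connections for each pair, and merge a pair into one group when the net score reaches the equality threshold. Using a single net score rather than two separate thresholds is a simplifying assumption of mine, as are the function name and the vote format.

```python
from collections import Counter


def equivalence_groups(votes, equality_threshold=1):
    """Group identifiers that enough trusted connections call equal.

    `votes` is an iterable of (op, a, b) tuples where op is "EQ" or "NE".
    """
    tally = Counter()
    for op, a, b in votes:
        pair = tuple(sorted((a, b)))
        tally[pair] += 1 if op == "EQ" else -1

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in tally.items():
        root_a, root_b = find(a), find(b)
        if score >= equality_threshold:
            parent[root_a] = root_b

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())
```

Equality is transitive in this sketch (if A = B and B = C then all three land in one group), which may or may not be what you want when an NE vote exists between A and C.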
All of this can be done with messages posted to the federation through a GNU Social based network. Other instances can listen to the operations and make changes on the individual accounts.
- The message is passed from
email@example.com the message and updates their record.
As I see it, there are only a few operations:
- ADD to create a positive relationship
- SUB to create a negative one
- EQ for establishing an equality
- NE for establishing an inequality
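Since these are just short text messages, parsing them is straightforward. The sketch below assumes a wire format of the operation name followed by one hash (ADD, SUB) or two hashes (EQ, NE), separated by whitespace; the Operation type and parse_operation function are hypothetical names for illustration.

```python
import re
from typing import NamedTuple

HASH = re.compile(r"^[0-9a-f]{32}$")
# Number of target hashes each operation takes (assumed format).
ARITY = {"ADD": 1, "SUB": 1, "EQ": 2, "NE": 2}


class Operation(NamedTuple):
    verb: str
    targets: tuple[str, ...]


def parse_operation(message: str) -> Operation:
    """Parse one message such as 'SUB <hash>' or 'EQ <hash> <hash>'."""
    tokens = message.split()
    if not tokens or tokens[0] not in ARITY:
        raise ValueError(f"unknown operation: {message!r}")
    verb, args = tokens[0], tuple(tokens[1:])
    if len(args) != ARITY[verb] or not all(HASH.match(a) for a in args):
        raise ValueError(f"malformed arguments: {message!r}")
    return Operation(verb, args)


op = parse_operation("SUB 1a7e49dc5912887d23a727c3bd933750")
```

Rejecting anything that doesn't match keeps a listening instance from acting on ordinary chatter that happens to start with one of these words.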
There are also a few commands to retrieve data. The only one I can see is the need to retrieve the current list.
firstname.lastname@example.org a message to
email@example.com with a temporary URL with a full list.
Now, I could easily see a privacy problem there. Even with encoding, it would be easy enough to look for SUB 1a7e49dc5912887d23a727c3bd933750 to find someone who dislikes me. The accounts don't give too much privacy because they have to be posted in a semi-public location.
To make these hard to detect, accounts can establish an AES encryption key with which all of their messages are encrypted. For simple obfuscation, the key can be posted in the public profile of that account. Anyone subscribing can decrypt it, but it would take significant effort to scan the network to find specific operations since every account would require a different encryption key.
Or an account could have an encryption key set but share it out-of-band for private federation.
The end result is we would have an account on a federated instance. The account would have a number of private variables:
- positive threshold: 1
- negative threshold: 1
- equality threshold: 1
- inequality threshold: 1
- private key: binary
There would also be a flag if the private key is exposed to users.
It would have the relationships with their optional encryption keys.
It would have the relationships that the user specifically set on their own (they trump everything).
EQ c877a19c5bdb62a2e3e946411fb75dd5 1a7e49dc5912887d23a727c3bd933750
We also keep track of the relationships we got through our connections.
firstname.lastname@example.org ADD c877a19c5bdb62a2e3e946411fb75dd5
email@example.com ADD 1a7e49dc5912887d23a727c3bd933750
These would be combined into a single unified list (assuming thresholds of 1 for this example).
ADD 1dca72d547e37c33cbdbdeeecd3d6119 (maybe a score of 100)
SUB c877a19c5bdb62a2e3e946411fb75dd5 (score 1)
SUB 1a7e49dc5912887d23a727c3bd933750 (score 1)
As operations are federated, they would filter through the various inputs and produce the final consolidated list above.
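A sketch of that consolidation step, under my own assumptions: inherited ADD/SUB operations count as +1/-1 per connection, thresholds are magnitudes, and anything the user set explicitly overrides the inherited result with a large fixed score. The consolidate function, EXPLICIT_SCORE, and the data shapes are made up for illustration, and it only covers ADD and SUB.

```python
from collections import defaultdict

EXPLICIT_SCORE = 100  # assumed weight, echoing the "maybe a score of 100" above


def consolidate(explicit, inherited, positive_threshold=1, negative_threshold=1):
    """Merge the user's own operations with those received from connections.

    `explicit` is a list of (verb, target) pairs set by the user.
    `inherited` is a list of (source_account, verb, target) triples.
    Returns a dict mapping target -> (verb, score).
    """
    totals = defaultdict(int)
    for _source, verb, target in inherited:
        totals[target] += 1 if verb == "ADD" else -1

    result = {}
    for target, total in totals.items():
        if total >= positive_threshold:
            result[target] = ("ADD", total)
        elif total <= -negative_threshold:
            result[target] = ("SUB", -total)
        # Totals that reach neither threshold leave no entry at all.

    # Explicit settings trump everything inherited.
    for verb, target in explicit:
        result[target] = (verb, EXPLICIT_SCORE)
    return result
```

In practice this would be recomputed (or incrementally updated) whenever a federated operation arrives, but the merge rule itself stays this small.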
Overall, these are very simple database structures.
The final step for this is how to use it. With the above example, it could be a very simple REST request to determine the status of any relationship. It would require an application key (something that Mastodon already has, so easily done) but would just request an HTTP code or status.
The output would be a relatively simple JSON response that gives the final state (SUB) but maybe allows for the equivalents to also be retrieved. I'm not sure what is needed.
Instead of just returning SUB from the REST call, it could return the actual amount along with the user's explicit setting. That would allow for hidden, obscured, warning, or a “verified” check mark of sorts.
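To make that concrete, here is one shape the JSON could take. The field names (target, state, score, explicit) and the "NONE" state for "no opinion" are all assumptions on my part, not a settled format.

```python
import json


def status_payload(target: str, score: int, explicit: bool) -> str:
    """Serialize the consolidated state of one target as a JSON string."""
    if score > 0:
        state = "ADD"
    elif score < 0:
        state = "SUB"
    else:
        state = "NONE"  # hypothetical value meaning "no opinion"
    return json.dumps({
        "target": target,
        "state": state,
        "score": score,
        "explicit": explicit,
    })


body = status_payload("1a7e49dc5912887d23a727c3bd933750", -2, False)
```

A client could then decide for itself whether a score of -1 means "warn" and -5 means "hide", instead of the server making that call.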
Here is how I could see it being used. A user script (TamperMonkey) could easily be developed that uses an application key and URL to request the block status for any identifier. It could then look at the page as it loads, construct a URL, hash it with MD5, and then use REST to get the status and determine whether specific elements of the page need to be hidden, highlighted, or otherwise marked as a positive or negative result.
Assuming the results were cached reasonably, it could remain relatively performant. Likewise, we could use a POST operation that retrieves multiple records at once to avoid network overhead.
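The caching and batching logic could look something like this sketch. The StatusCache class is hypothetical, and fetch_many stands in for whatever function actually performs the batched POST; it is assumed to take a list of hashes and return a dict of hash to state.

```python
from typing import Callable, Dict, Iterable, List


class StatusCache:
    """Remember statuses already fetched and batch the ones that are missing."""

    def __init__(self, fetch_many: Callable[[List[str]], Dict[str, str]]):
        self._fetch_many = fetch_many  # stand-in for the batched POST request
        self._cache: Dict[str, str] = {}

    def lookup(self, targets: Iterable[str]) -> Dict[str, str]:
        targets = list(targets)
        # De-duplicate while preserving order, then skip what is cached.
        missing = [t for t in dict.fromkeys(targets) if t not in self._cache]
        if missing:
            fetched = self._fetch_many(missing)
            for t in missing:
                # Targets the server does not know about are cached as "NONE".
                self._cache[t] = fetched.get(t, "NONE")
        return {t: self._cache[t] for t in targets}
```

There is no expiry here, so a real version would want some time limit on entries so that new blocks eventually show up.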
Websites and applications could use the REST API directly to do the same thing. That way, sites like Ello or Mastodon could integrate the calls into their own retrieval so blocked items never show up on the user's page.
It could also be used to identify the reliability of websites (it's just a URL) with a TamperMonkey script putting a banner up (this is fake news!).
I could also see hashtags used to identify specifics or communicate additional ideas. For example, SUB 1a7e49dc5912887d23a727c3bd933750 #klingon to indicate negative opinions about Klingons. That is a more complicated step and I think the rest needs to settle beyond one night's writeup.
Well, that's my idea along with rough implementation ideas of how to create a network of trust and distrust between people that would be relatively organic and doesn't rely on a single company to ensure it works. It is based on existing libraries, frameworks, and infrastructure which means I think it is reasonable to implement.
Opinions are always appreciated.
Edit 1: Modifying Results
The same tools could also be used to create the direct relationships. A TamperMonkey-added button next to an account entry could send a POST that adds the “ADD” or “SUB” above. Likewise, websites could pass their like/dislike through a REST POST call for the same effect. That way, there is almost no friction for contributing to the federation.