Best Practices - Identifying Spam

From The Socknet

Spam is unsolicited, unwanted communication.

One Kind of Spam

There is only one kind of spam in the Socknet: friend request spam. This refers to friend requests which don't represent users, but simply act as advertisements to the user.

What constitutes spam is defined by the users.

When a user receives a friend request that he thinks is spam, he marks it as spam, and his provider sends a spam report to all of his friends using the Gossip system. The provider should not accept further friend requests from that sender on the user's behalf.

When a provider receives a friend request that has been reported as spam too many times, it should put the request in a section marked "Probably Spam".

However, providers should be careful about trusting their users too much. Users may purposefully mark others as spam in an attempt to deny them service. No user's input should have too much weight.
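One way to keep any single user's input from carrying too much weight is to cap each reporter's contribution before comparing the total against a threshold. The sketch below is only an illustration: the function name `is_probably_spam`, the cap of 1.0 per reporter, and the threshold of 3.0 are assumptions, not values defined by the Socknet.

```python
# Hypothetical sketch: each reporter contributes at most MAX_WEIGHT_PER_REPORTER,
# however many times they report the same requester.
MAX_WEIGHT_PER_REPORTER = 1.0   # assumed cap, not specified by this document
PROBABLY_SPAM_THRESHOLD = 3.0   # assumed threshold, chosen by each provider


def is_probably_spam(reports):
    """reports: iterable of (reporter_id, weight) pairs about one requester."""
    per_reporter = {}
    for reporter, weight in reports:
        total = per_reporter.get(reporter, 0.0) + weight
        # Cap the running total so repeated reports cannot pile up.
        per_reporter[reporter] = min(total, MAX_WEIGHT_PER_REPORTER)
    return sum(per_reporter.values()) >= PROBABLY_SPAM_THRESHOLD
```

With these assumed values, ten reports from one user are not enough to move a request into "Probably Spam", while one report each from three different users is.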

Why only friend requests?

A user is able to regulate everything that he sees except friend requests.

  1. A service cannot spam, because it must register with the user or already be communicating with the user in order to post messages.
  2. A friend or service that sends too much information can simply be gagged.

Friend requests are the only completely unsolicited information coming into the system.

The Specification

When a user marks a friend request as spam, his provider records that information in its own database and reports it to his friends using the gossip function with the isa string "spammer".

Example:

POST gossip
{ from: ...,
  gossip: [
    { openid: "http://spammer.com",
      provider: "http://provider.com/user/12/",
      isa: [ "spammer" ]
    }
  ]
}

The entity's guid also includes a provider field. If the user knows he's been outed, he will probably disappear suddenly. In that case, the provider field identifies the provider he was using, so that it may also be marked as spammy.

For fairness, the same report is also sent to the reported party, so that he knows he has been reported.

One reason for this is so that users who are accidentally marked as spam can quickly resolve the problem. A user accidentally marked as spam can contact the friend who reported him through other means, and that user can go back to his provider to unspam the friend.

The other reason is so that the spammer's provider can check whether the spammer is really sending spam and silence him to protect its own reputation.
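A minimal sketch of how a provider might assemble the report and decide who receives it, assuming the body shape shown in the example above. The function names `build_spam_report` and `report_recipients` are hypothetical, and the `from` value is left as a parameter because the example does not specify it.

```python
def build_spam_report(reporter, spammer_openid, spammer_provider):
    """Return the gossip body for a spam report as a dict."""
    return {
        "from": reporter,  # format of this value is not specified here
        "gossip": [
            {
                "openid": spammer_openid,
                "provider": spammer_provider,
                "isa": ["spammer"],
            }
        ],
    }


def report_recipients(friends, spammer_openid):
    """Friends get the report, and so does the reported party itself."""
    recipients = list(friends)
    if spammer_openid not in recipients:
        recipients.append(spammer_openid)
    return recipients
```

Sending the identical body to the reported party keeps the process symmetric: whatever the reporter's friends learn, the reported party and his provider learn too.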

Unspamming

Unspamming is just like spamming, except that the gossip uses the nota field.

POST gossip
{ from: ...,
  gossip: [
    { openid: "http://spammer.com",
      provider: "http://provider.com/user/http___spammer.com/",
      nota: [ "spammer" ]
    }
  ]
}
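On the receiving side, a provider could treat isa and nota as add and remove operations on its own records. The sketch below assumes a hypothetical in-memory store mapping each reported openid to the set of reporters, and assumes that an unspam only retracts the sender's own earlier report; neither detail is fixed by this page.

```python
def apply_gossip(store, body):
    """Update store (openid -> set of reporters) from one gossip body."""
    reporter = body["from"]
    for entry in body["gossip"]:
        reporters = store.setdefault(entry["openid"], set())
        if "spammer" in entry.get("isa", []):
            reporters.add(reporter)       # spam report
        if "spammer" in entry.get("nota", []):
            reporters.discard(reporter)   # unspam: retract this reporter only
    return store
```

Keeping reporters in a set means a repeated report from the same user is counted once, and an unspam from someone who never reported has no effect.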

The Long Arm of the Spamminess Rating

A spammer is a user or provider that is marked as a spammer by a user.

Spamminess is determined in any fashion that the provider likes. There will be a suggested system. Generally it's represented as a percentage, but since it is entirely internal, the mechanism doesn't matter to this specification.

The gossip does not include spamminess ratings. If it did, it would be a mild security risk and more importantly it would be meaningless. Spamminess is entirely subjective, and this document makes no mention of a standard threshold.

Moreover, if spamminess were reported, a standard would probably arise, and it might not be a good one. Each provider is charged with the duty of determining how spammy a requester is allowed to become before action is taken.

A user's spamminess should "rub off" on his provider. It should also affect all the parts of the provider's domain name.

For example: a provider named x.a_b_c.com could also run under the domain name y.a_b_c.com with very little extra work, so it is imperative to mark a_b_c.com as spammy too. The parent domain should, however, be rated less spammy than x.a_b_c.com itself.

As a special case, TLDs are exempt, so com is completely unspammy. A whitelist can help with multi-part suffixes such as co.uk and com.bo.
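The rub-off rule can be sketched as a walk up the host name that hands each parent domain a reduced score and stops at an exempt suffix. The halving factor, the tiny whitelist, and the name `domain_spamminess` are all assumptions made for illustration; a provider is free to use any scheme.

```python
WHITELISTED_SUFFIXES = {"co.uk", "com.bo"}  # assumed sample; a real list would be longer
DECAY = 0.5  # assumed factor by which spamminess shrinks per parent domain


def domain_spamminess(host, score):
    """Return {domain: score} for host and each non-exempt parent domain."""
    labels = host.split(".")
    result = {}
    for i in range(len(labels)):
        domain = ".".join(labels[i:])
        is_tld = i == len(labels) - 1
        if is_tld or domain in WHITELISTED_SUFFIXES:
            break  # TLDs and whitelisted suffixes stay completely unspammy
        result[domain] = score
        score *= DECAY
    return result
```

For x.a_b_c.com with a score of 0.8, this marks x.a_b_c.com at 0.8 and a_b_c.com at 0.4, and leaves com untouched.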

A provider is free to use any system to determine spamminess, and even ignore spam reports.

However, a provider SHOULD send spam reports to friends even if it ignores them itself.

Time Heals All Things

Spam data MUST be abandoned after some time without activity. Activity means spam reports from users and their friends.

Spammers are constantly on the move, and quickly abandon domains. Those domains should become usable again to real users.
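Expiry can be sketched as a periodic sweep that drops any entry with no recent spam reports. The 90-day window and the name `expire_stale` are assumptions; this page only requires that data be abandoned after some time without activity.

```python
import time

MAX_IDLE_SECONDS = 90 * 24 * 3600  # assumed 90-day window; no value is specified


def expire_stale(last_activity, now=None):
    """Drop entries whose latest spam report is older than MAX_IDLE_SECONDS.

    last_activity maps a domain or openid to the Unix time of its latest report.
    """
    if now is None:
        now = time.time()
    return {
        key: seen
        for key, seen in last_activity.items()
        if now - seen <= MAX_IDLE_SECONDS
    }
```

Any new report simply refreshes the timestamp for its entry, so an active spammer's domain never ages out while an abandoned one eventually does.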

Subjectivity

At no time should a barrier be put up to anyone wanting to communicate. Even if a friend request is considered spammy, it's up to the user to accept or reject it.

Maybe the spammers can all be friends with each other.

That said, a provider is free to kick anyone off of its service.


Share and Share Alike

Spamminess is a rare example of information that should be shared among all of a provider's users. It is in everyone's best interest to spread this information as far as it will go. It is also in everyone's best interest to reduce the effect of this information by using weighting.

It is acceptable for providers that trust each other to share spam databases, provided the shared data does not include their weighting.
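One reading of this rule is that a provider exports only who reported whom and keeps its internal weights private. The sketch below assumes a hypothetical database layout in which each record carries a reporter and a weight; the layout and the name `export_for_sharing` are not part of the Socknet.

```python
def export_for_sharing(db):
    """db maps openid -> list of {"reporter": ..., "weight": ...} records.

    Returns openid -> sorted list of distinct reporters, with weights dropped.
    """
    return {
        openid: sorted({record["reporter"] for record in records})
        for openid, records in db.items()
    }
```

The receiving provider can then apply its own weighting to the raw reports instead of inheriting someone else's.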
