• 1 Post
  • 5 Comments
Joined 1 year ago
Cake day: June 8, 2023


  • So let's be clear - there is no way to prevent others from crawling your website if they really want to (AI or non-AI).

    Sure, you can put up a robots.txt or reject certain user agents (if you self-host) to screen out the most common crawlers. But as far as your hosting is concerned, an AI crawler is not too different from, e.g., Google's crawler, which also takes pieces of content to show in results. You can put up a captcha or equivalent to screen out non-humans, but this does not work that well and might also prevent search engines from finding your site (which you may or may not want).
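    As a sketch of the robots.txt route: the entries below ask some publicly documented AI crawlers (GPTBot, CCBot, Google-Extended) to skip the whole site while leaving everyone else alone. Note this is purely advisory - a crawler that wants your content will just ignore it.

```text
# Ask known AI crawlers to skip the whole site.
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# Everyone else (including regular search crawlers) stays allowed.
User-agent: *
Disallow: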

    I don’t have a solution for the AI problem. As for the “greed” problem, I think most of us poor folks do one of the following:

    • GitHub Pages (if you don’t like GitHub, then Codeberg or one of the other software forges that host pages)
    • self-host your own HTTP server if it’s not too much of a hassle
    • (make backups, yes always backups)

    Now for the AI problem, there are no good solutions, but there are funny ones:

    • write stories that seem plausible but hide hijinks in there - if there ever was a good reason for being creative, it is “I hope AI crawls my story and the nighttime news reports that the army is now using trained squirrels as paratroopers”
    • doublespeak - if it works for fictional fascist states, it works for AI too - replace all uses of a word/expression with another; your readers might be slightly confused, but such is life
    • turn off your website at certain times of the day - just show a message saying that it only works outside of US work hours or something
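    The “office hours only” gag boils down to one check on the server. A minimal sketch (the 9:00-17:59 window and the function names are made up for illustration):

```javascript
// Decide whether the site should be "closed": returns true during
// hypothetical US work hours (9:00-17:59 in some US timezone).
function isUsWorkHours(hour) {
  return hour >= 9 && hour < 18;
}

// In a Node HTTP handler you would then do something like:
//   if (isUsWorkHours(currentUsHour())) {
//     res.writeHead(503);
//     res.end("Closed during US work hours - come back later.");
//   }
// where currentUsHour() is whatever timezone conversion you prefer.
```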

    I should point out that none of this will make you famous or raise your SEO rank in search results.

    PS: can you share your site? Now I’m curious about the stories.


  • Here is my take as someone who absolutely loves the work SimpleX did on the SMP protocol, but still does not use SimpleX Chat.

    First the trivial stuff:

    1. no one else seems to use it
    2. the UX is not great because of the initial exchange

    These two are not that unexpected. Any other chat app with E2E security has tricky UX, and SimpleX takes the hard road by not trading off security/privacy for UX. I think this is a plus, but yes it annoys people.

    Now for the reasons that really keep me away:

    1. the desktop app is way behind the mobile app - and I would really prefer to use a desktop CLI app
    2. Haskell puts me off a bit - the language is fine, I just don’t know how to read it. On a more practical note, it did not support older (armv6/7) devices, which kept lots of people on older hardware away
    3. AFAIK no alternative implementations of either the client or the SMP server exist - which is a pity; I think the protocol would shine in other contexts (like push notifications)
    4. I was going to say that there are not many 3rd-party user groups - but I just found out about the directory service (shame on me, maybe? can’t seem to find groups though)
    5. protocol features/stabilization are a moving target, and most of the fancy new features don’t really interest me (I don’t care much about audio/video)
    6. stabilization of code/dependencies would help package the server/client in more Linux distros, which I think would help adoption among the tech folk

    Finally a couple of points on some of the other comments:

    • multi-device support - no protocol out there can do multi-device properly (not Signal, none really), so I’m OK with biting the bullet on this
    • VC funding is a drag - but I am still thankful that they clearly specified the chat protocol separately from the message relay, which means that even if the chat app dies, SMP could still be used for other stuff.

  • First of all, you should assume the server can infer this in a number of ways - there is actually no way to fully block it, but we can try.

    The main issue for privacy is that it makes your browser behave in ways that are a bit too specific (i.e. less private by comparison with the rest of the browsers in the known universe).

    As for techniques the site can use:

    • JavaScript can test the geometry of something that was rendered and draw conclusions - was this font actually used? Test several options and check for variations
    • measure font work between network events, i.e. generate a page with unique links where the browser 1) fetches a font, 2) renders text, and 3) only then performs another fetch - measure the time between 1) and 3) and draw conclusions. Repeat across test cases - e.g. is the browser noticeably faster with monospace than with a huge custom font? Not a great method, but not completely worthless
    • some techniques can do part of this even without JavaScript, provided you can generate some weird CSS/HTML that conditionally triggers a fetch
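    The geometry test in the first bullet can be sketched in a few lines: render the same string once with a candidate font (falling back to monospace) and once with plain monospace, then compare widths. Everything here is illustrative - the font name, the sample string, and the epsilon threshold are all arbitrary assumptions, not a real fingerprinting script.

```javascript
// Pure comparison: if the two widths differ by more than epsilon,
// the candidate font was actually used (the fallback didn't kick in).
function fontWasUsed(widthWithCandidate, widthFallbackOnly, epsilon = 0.5) {
  return Math.abs(widthWithCandidate - widthFallbackOnly) > epsilon;
}

// Browser-only measurement: render a hidden span and read its width.
function measureWidth(fontFamily, sample = "mmmmmmmmmmlli") {
  const span = document.createElement("span");
  span.style.fontFamily = fontFamily;
  span.style.position = "absolute";
  span.style.visibility = "hidden";
  span.textContent = sample;
  document.body.appendChild(span);
  const width = span.getBoundingClientRect().width;
  span.remove();
  return width;
}

// Guarded so the comparison logic above also runs outside a browser.
if (typeof document !== "undefined") {
  const detected = fontWasUsed(
    measureWidth('"SomeCustomFont", monospace'), // hypothetical font name
    measureWidth("monospace")
  );
  console.log("custom font rendered:", detected);
}
```

    With document fonts disabled in the browser, both measurements fall back to monospace, the widths match, and the page can infer your setting - which is exactly the "less private by being unusual" problem described above.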

    By the way, not downloading the fonts also makes you “less private”. Some of this is a stretch, but not impossible.

    Now for a more practical problem: lots of sites use custom fonts for icons. This means some sites will be very hard to use, because they only display buttons with an icon (actually a letter in a custom font).

    FWIW these two lines are in my Firefox profile to disable downloads and skip document provided fonts:

    user_pref("gfx.downloadable_fonts.enabled", false);
    user_pref("browser.display.use_document_fonts", 0);
    

    If someone has better/different settings please share.

    Finally, the Tor Browser folks did good work on privacy protections on top of FF. Maybe their issue tracker is a good source of inspiration: https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues/18097