Off-and-on trying out an account over at @tal@oleo.cafe due to scraping bots bogging down lemmy.today to the point of near-unusability.

  • 137 Posts
  • 6.79K Comments
Joined 2 years ago
Cake day: October 4th, 2023


  • I once wrote code for an elderly researcher who would only review code as a hard copy. I’d bring him stacks of paper and he’d get going with his pen and highlighter. And I’ll grant that the resolution is normally higher on paper than on most displays. I’m viewing this on a laptop screen that’s about 200 ppi. A laser printer is probably printing at a minimum of 300 dpi, maybe 600 or 1200 dpi.

    I still think that the few people reading things in print are the exception that proves the rule, though.



  • Times New Roman was designed for the print era, and Calibri for onscreen viewing. Onscreen viewing is a lot more common today. Based on that technical characteristic, I’d be kind of inclined to favor Calibri or at least some screen-oriented font.

    That being said, screens are also higher-resolution than they were in the past, so the rationale might be less significant than it once was.

    https://en.wikipedia.org/wiki/Calibri

    Calibri (/kəˈliːbri/) is a digital sans-serif typeface family in the humanist or modern style. It was designed by Lucas de Groot in 2002–2004 and released to the general public in 2006, with Windows Vista.[3] In Microsoft Office 2007, it replaced Times New Roman as the default font in Word and replaced Arial as the default font in PowerPoint, Excel, and Outlook. In Windows 7, it replaced Arial as the default font in WordPad. De Groot described its subtly rounded design as having “a warm and soft character”.[3] In January 2024, the font was replaced by Microsoft’s new bespoke font, Aptos, as the new default Microsoft Office font, after 17 years.[4][5]

    I suspect that the Office shift is probably a large factor in moving to Calibri.

    That being said, there are many Times New Roman implementations, but it sounds like Calibri is owned by Microsoft, so I’d be kind of inclined to favor something open.








  • If you mean distributing inference across many machines, none of which could individually handle a large model, then with today’s models that isn’t viable at reasonable performance. The problem is that you need a lot of bandwidth between layers; a lot of data moves. When people cluster current systems, they tend to use specialized, high-bandwidth links.

    It might theoretically be possible to build models that are more amenable to this sort of thing, where small parts of a model run on nodes that have little data interchange between them. But until they’re built, it’s hard to say.

    I’d also be a little leery of how energy-efficient such a thing would be, especially if you want to use CPUs, which are probably more amenable to being run in a shared fashion than GPUs. Just using CPU time “in the background” also probably won’t work well on a system that’s running other tasks, because the limiting factor isn’t heavy crunching on a small amount of data (where a processor can make use of idle cores without much impact on other tasks) but bandwidth to memory, which is going to be a bottleneck for the whole system. There are also some fairly substantial memory demands, unless you can get model size way down too.
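
    To put a rough number on the memory-bandwidth point, here is a back-of-envelope sketch. It assumes a dense model where every weight is read from memory once per generated token and nothing else competes for bandwidth. The function name and the example figures (a 7-billion-parameter model at 2 bytes per weight, 50 GB/s of memory bandwidth) are made up for illustration, not measurements of any real system.

```python
def max_tokens_per_second(model_bytes: float, mem_bandwidth_bytes_per_s: float) -> float:
    """Upper bound on generation speed when memory bandwidth is the limit.

    Assumption: each generated token requires streaming the full set of
    weights from memory once, so speed <= bandwidth / model size.
    """
    return mem_bandwidth_bytes_per_s / model_bytes


# Hypothetical figures, chosen only to illustrate the arithmetic.
model_bytes = 7e9 * 2   # 7 billion weights at 2 bytes each
bandwidth = 50e9        # 50 GB/s of memory bandwidth
estimate = max_tokens_per_second(model_bytes, bandwidth)
print(f"at most about {estimate:.1f} tokens per second")
```

    Under those assumptions the ceiling is set by bandwidth alone, so adding idle cores doesn’t raise it, and anything else on the machine that touches memory heavily lowers it.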



  • I wonder how much exact duplication each process has?

    https://www.kernel.org/doc/html/latest/admin-guide/mm/ksm.html

    Kernel Samepage Merging

    KSM is a memory-saving de-duplication feature, enabled by CONFIG_KSM=y, added to the Linux kernel in 2.6.32. See mm/ksm.c for its implementation, and http://lwn.net/Articles/306704/ and https://lwn.net/Articles/330589/

    KSM was originally developed for use with KVM (where it was known as Kernel Shared Memory), to fit more virtual machines into physical memory, by sharing the data common between them. But it can be useful to any application which generates many instances of the same data.

    The KSM daemon ksmd periodically scans those areas of user memory which have been registered with it, looking for pages of identical content which can be replaced by a single write-protected page (which is automatically copied if a process later wants to update its content). The amount of pages that KSM daemon scans in a single pass and the time between the passes are configured using sysfs interface

    KSM only operates on those areas of address space which an application has advised to be likely candidates for merging, by using the madvise(2) system call:

    int madvise(addr, length, MADV_MERGEABLE)
    

    One imagines that one could maybe make a library interposer to induce use of that.
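
    For reference, this is roughly the call such an interposer would need to make on a process’s behalf. It is not an interposer itself (that would more likely be a small C shared library loaded with LD_PRELOAD); it is only a sketch of the madvise step using Python’s mmap module, assuming Python 3.8 or later on Linux. The function name map_mergeable is made up for illustration.

```python
import mmap


def map_mergeable(pages: int = 4):
    """Create a private anonymous mapping and ask KSM to consider it.

    Returns the mapping if madvise(MADV_MERGEABLE) was accepted, or None
    if this platform or kernel does not support it.
    """
    advice = getattr(mmap, "MADV_MERGEABLE", None)  # Linux-only constant
    if advice is None:
        return None
    length = pages * mmap.PAGESIZE
    # KSM only merges private anonymous pages, so request that explicitly.
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    region.write(b"A" * length)  # give every page identical content
    try:
        region.madvise(advice)  # madvise(addr, length, MADV_MERGEABLE)
    except OSError:
        region.close()  # e.g. EINVAL on a kernel built without CONFIG_KSM
        return None
    return region


region = map_mergeable(4)
print("advice accepted" if region is not None else "MADV_MERGEABLE unavailable")
```

    Accepting the advice doesn’t mean anything has been merged yet: ksmd has to be running (/sys/kernel/mm/ksm/run set to 1), and the effect shows up over time in counters like /sys/kernel/mm/ksm/pages_sharing.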








  • I’ve also noticed that if you want a chest smaller than DDD, it’s almost impossible with some models, unless you specify that they are a gymnast.

    That’s another weakness of present-day generative image AI: humans have an intuitive understanding of relative terms and can iterate on them.

    So, it’s pretty easy for me to point at an image and ask a human artist to “make the character’s breasts larger” or “make the character’s breasts smaller”. A human artist can look at an image, form a mental model of the image, and produce a new image in their head relative to the existing one by using my relative terms “larger” and “smaller”. They can then go create that new image. Humans, with their sophisticated mental model of the world, are good at that.

    But we haven’t trained an understanding of relative relationships into diffusion models today, and doing so would probably require a more sophisticated (maybe vastly more sophisticated) type of AI. “Larger” and “smaller” aren’t really usable as things stand today. Because breast size is something that people often want to muck with, people have trained models on a static list of danbooru tags for breast sizes, and models trained on those can use them as inputs, but even then, it’s a relatively limited capability. And for most other properties of a character or thing, even that’s not available.

    For models which support it, prompt term weighting can sometimes provide a very limited analog to this. Instead of saying “make the image less scary”, maybe I “decrease the weight of the token ‘scary’ by 0.1”. But that doesn’t work with all relationships, and the outcome isn’t always fantastic even then.
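
    As a sketch of what that looks like mechanically, here is a small helper that nudges the weight of one term in a prompt string. It assumes the “(term:weight)” syntax that some Stable Diffusion front-ends accept and treats an unweighted term as having weight 1.0; other tools use different syntax or none at all. The function name adjust_weight is made up for illustration.

```python
import re


def adjust_weight(prompt: str, term: str, delta: float) -> str:
    """Add delta to the weight of `term`, assuming "(term:weight)" syntax.

    A term with no explicit weight is treated as having weight 1.0.
    If the term does not appear, the prompt is returned unchanged.
    """
    weighted = re.compile(
        r"\(\s*" + re.escape(term) + r"\s*:\s*([0-9]*\.?[0-9]+)\s*\)"
    )
    match = weighted.search(prompt)
    if match:
        new_weight = float(match.group(1)) + delta
        replacement = f"({term}:{new_weight:.2f})"
        return prompt[:match.start()] + replacement + prompt[match.end():]
    # No explicit weight yet: wrap the first bare occurrence of the term.
    replacement = f"({term}:{1.0 + delta:.2f})"
    bare = re.compile(r"\b" + re.escape(term) + r"\b")
    return bare.sub(lambda m: replacement, prompt, count=1)


print(adjust_weight("a scary forest at night", "scary", -0.1))
```

    This only rewrites the prompt text; how much a 0.1 change actually shifts the image depends entirely on the model and front-end.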