@artificialmind@fosstodon.org avatar

artificialmind

@artificialmind@fosstodon.org

C++ library author, hobbyist programming language architect, obsessive optimizer

dotstdy , to random
@dotstdy@mastodon.social avatar

CPU optimisation guide: You should try vectorizing
GPU optimisation guide: You should try scalarizing

CHOOSE A LANE

artificialmind ,

@dotstdy @aras so what's next? running lisp on an intel CPU and running x86 asm on a lisp machine?

pervognsen , (edited ) to random
@pervognsen@mastodon.social avatar

It looks a bit funny but Rc<Arc<T>> seems like a reasonable choice in a lot of cases. Specifically, you have locally shared ownership of a remotely shared resource instead of directly sharing ownership of the remote resource (which comes with contention issues). Most of the time you probably wouldn't literally have Rc<Arc<T>> but Rc<LocalStruct> where LocalStruct (transitively) has an Arc<T>. But same thing really.

artificialmind ,

@pervognsen @SonnyBonds they are two pointers but if you use std::make_shared (which is the idiomatic way nowadays), then it only does a single allocation where control and data block are adjacent.
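A minimal sketch of how the single-allocation claim could be checked, assuming a typical standard library (the C++ standard only recommends, and does not require, a single allocation for std::make_shared). It replaces the global operator new with a counting version; `g_allocations`, `g_sink`, `Payload`, and the two helper functions are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Counts every call to the replaced global operator new below.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

struct Payload { int values[4]; };

// Non-static global so the pointers escape and the allocations stay observable.
std::shared_ptr<Payload> g_sink;

// Heap allocations made by the "two-step" construction:
// one for the object, one for the control block.
std::size_t allocations_with_new() {
    const std::size_t before = g_allocations;
    g_sink = std::shared_ptr<Payload>(new Payload{});
    return g_allocations - before;
}

// Heap allocations made by std::make_shared:
// object and control block usually share one block.
std::size_t allocations_with_make_shared() {
    const std::size_t before = g_allocations;
    g_sink = std::make_shared<Payload>();
    return g_allocations - before;
}
```

On the common implementations (libstdc++, libc++, MSVC STL), `allocations_with_new()` should report 2 and `allocations_with_make_shared()` should report 1.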

artificialmind ,

@pervognsen @SonnyBonds Yep and it was (at least in my perception) always advertised and taught as "the modern/idiomatic way" when using smart pointers. std::make_shared also has some exception-safety benefits: before C++17, writing shared_ptr<T>(new T) as one of several function arguments could leak the object if evaluating another argument threw in between.

The only real "downside" with shared allocation is that weak pointers can keep the data allocation alive even if the data itself is not accessible anymore. But I haven't encountered that issue in real code yet.
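The weak-pointer downside can be made visible with a small experiment. This is a sketch under assumptions: it uses std::allocate_shared with a hypothetical `CountingAllocator` as a stand-in for std::make_shared (both combine object and control block in common implementations), and `Big`, `Observation`, and `observe_weak_ptr_lifetime` are made-up names.

```cpp
#include <cstddef>
#include <memory>
#include <new>

static std::size_t g_live_bytes = 0;  // bytes currently allocated through the allocator
static bool g_destroyed = false;      // set once Big's destructor has run

// Minimal stateless allocator that tracks live bytes in a global counter.
template <class T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}
    T* allocate(std::size_t n) {
        g_live_bytes += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) noexcept {
        g_live_bytes -= n * sizeof(T);
        ::operator delete(p);
    }
};
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

struct Big {
    char data[1 << 20];  // 1 MiB payload
    ~Big() { g_destroyed = true; }
};

struct Observation {
    bool destroyed_after_strong_reset;
    std::size_t live_after_strong_reset;
    std::size_t live_after_weak_reset;
};

Observation observe_weak_ptr_lifetime() {
    g_destroyed = false;
    auto strong = std::allocate_shared<Big>(CountingAllocator<Big>{});
    std::weak_ptr<Big> weak = strong;

    Observation o{};
    strong.reset();  // last strong owner gone: the destructor runs here
    o.destroyed_after_strong_reset = g_destroyed;
    o.live_after_strong_reset = g_live_bytes;  // storage still held by the weak_ptr

    weak.reset();    // last weak owner gone: the combined block is freed
    o.live_after_weak_reset = g_live_bytes;
    return o;
}
```

With a combined allocation, the object is already destroyed after `strong.reset()`, yet at least `sizeof(Big)` bytes stay allocated until the last weak_ptr goes away.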

castano , to random
@castano@mastodon.gamedev.place avatar

This is the performance regression that I discovered a while ago:

https://github.com/KhronosGroup/SPIRV-Tools/issues/5658

If you are targeting Android devices, make use of fp16 math and you are using a recent build of spirv-opt (last two years), then you are most likely affected. The proposed fix should give you a noticeable performance boost.

artificialmind ,

@aras @castano I have occasional nightmares from "x² - x²" being nonzero because the compiler helpfully optimized it to "fmsub(x, x, x*x)"
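The effect can be reproduced on purpose with std::fma, without relying on any compiler flag. In this sketch (the function name is hypothetical) one product is rounded to double first, while the fused operation uses the exact product and rounds only once, so the result is the rounding error of x*x. Compilers may perform the same fusion automatically when floating-point contraction is enabled (for example `-ffp-contract=fast` on GCC or Clang).

```cpp
#include <cmath>

// Computes "x*x - x*x" the way a contracting compiler might:
// the right-hand product is rounded, the left-hand one is fused into the FMA.
double square_minus_square_fused(double x) {
    const double rounded = x * x;      // product rounded to double
    return std::fma(x, x, -rounded);   // exact x*x minus rounded x*x, rounded once
}
```

For inputs whose square is exactly representable (such as 2.0) the result is zero. For an input like 0.1, whose square needs more than 53 significant bits, it is a tiny nonzero value.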

lritter , to random
@lritter@mastodon.gamedev.place avatar

the only developer surpassing even the full stack developer in competence is the developer whose knowledge, like a lightning bolt, reaches from the ivory tower high heavens of abstract computer science down to the deepest turing tarpits of hell

artificialmind ,

@lritter thanks for reminding me that Dependently Typed Assembly Language exists!

cliffle , to random
@cliffle@hachyderm.io avatar

How I feel listening to programmers concerned that floating point math is non-commutative

https://www.smbc-comics.com/comic/commute-2

artificialmind ,

@pkhuong @cliffle yeah at first I thought it's the usual confusion: floating point math is unexpectedly non-associative, not non-commutative. The comic is clearly about commutativity though so now I'm not sure.
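The distinction between the two properties is easy to check. This is a small sketch with hypothetical helper names, assuming IEEE 754 doubles and no value-changing flags such as `-ffast-math`.

```cpp
// True if regrouping the additions gives the same double for these inputs.
bool addition_is_associative_for(double a, double b, double c) {
    return (a + b) + c == a + (b + c);
}

// True if swapping the operands gives the same double for these inputs.
bool addition_is_commutative_for(double a, double b) {
    return a + b == b + a;
}
```

For example, `addition_is_associative_for(1e16, -1e16, 1.0)` is false: the left grouping yields 1.0, while in the right grouping the 1.0 is absorbed by -1e16 and the sum is 0.0. Swapping operands of a single addition, by contrast, never changes the result for non-NaN inputs.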

lritter , to random

i'm in my 40s now. not feeling a whole lot of nostalgia for past times yet. tbh it doesn't feel like much has changed. most of it is just more optimized. and for some things, i'd rather wish they finally disappear for good than come back again.

artificialmind ,

@pervognsen @lritter I miss snappy local applications without web-first philosophy though

lritter , to random

with a heap binned by size, heap fragmentation isn't possible (only in that, you're unlikely to get allocations that are right next to each other).

the 48 bit addressing range makes a heap with under 600 bins feasible (each encoding two bits of an arbitrary size), each of about 64GB in size.
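One possible reading of "two bits of an arbitrary size" is a size-class scheme that keeps the leading bit of the size plus the next two bits, giving four bins per power of two. The sketch below illustrates only that reading. It is an assumption, not the poster's actual layout, and `floor_log2`, `size_class`, and `class_capacity` are hypothetical names.

```cpp
#include <cstdint>

// Index of the highest set bit (v must be nonzero).
unsigned floor_log2(std::uint64_t v) {
    unsigned e = 0;
    while (v >>= 1) ++e;
    return e;
}

// Capacity of a bin: a 3-bit pattern 1mm (4..7) shifted left by index / 4.
std::uint64_t class_capacity(unsigned index) {
    return static_cast<std::uint64_t>(4 + index % 4) << (index / 4);
}

// Smallest bin whose capacity is at least `size`.
unsigned size_class(std::uint64_t size) {
    if (size <= 4) return 0;                   // everything tiny shares bin 0
    const unsigned shift = floor_log2(size) - 2;
    std::uint64_t top = size >> shift;         // leading bit + two more bits, in 4..7
    if ((top << shift) < size) ++top;          // round up if lower bits were dropped
    return 4 * shift + static_cast<unsigned>(top - 4);
}
```

Under this assumed scheme a request of 9 bytes lands in the bin with capacity 10, and a request of 2^48 bytes maps to bin index 184, so the whole 48-bit range fits comfortably under 600 bins.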

artificialmind ,

@lritter I'm drawn to requiring 48bit+ virtual memory for my language as well but I'm worried about portability in practice. Have you thought about how that decision limits you? From my current understanding, this scheme would not work properly in wasm, right?

artificialmind ,

@lritter I hope someone more knowledgeable corrects me but my understanding is: "normal" wasm right now has a 32 bit non-virtual growable linear memory model ONLY. There is https://github.com/WebAssembly/memory64 to extend that to 64 bit, but this is still non-virtual and grow-only. There is a proposal for multi-memory https://github.com/WebAssembly/multi-memory/blob/main/proposals/multi-memory/Overview.md which might make one-bin-per-memory feasible. Still no shrinking.

https://github.com/WebAssembly/memory-control is proposed which sounds like vmem...

lritter , to random

dichotomies are the gateway drug to trichotomies

artificialmind ,

@lritter and once you're fully off the deep end you start understanding nuance

lritter , to random

if two threads alter bytes that are right next to each other, are they going to affect each other? or are writes byte-atomic?

artificialmind ,

@lritter and the even harder version of that question: what if you have 4B stores (e.g. a u32) that overlap for 3B, i.e. unaligned stores. What happens then?
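For the simpler, non-overlapping case, C++ answers at the language level: since C++11, distinct bytes are distinct memory locations, so two threads writing adjacent bytes do not race (they may still share a cache line, which costs performance but not correctness). A minimal sketch, with a hypothetical function name:

```cpp
#include <array>
#include <cstddef>
#include <thread>

// Two threads each repeatedly increment only their own byte of a shared 2-byte array.
std::array<unsigned char, 2> write_adjacent_bytes(unsigned iterations) {
    std::array<unsigned char, 2> bytes{0, 0};
    auto worker = [&bytes, iterations](std::size_t index) {
        for (unsigned i = 0; i < iterations; ++i)
            bytes[index] = static_cast<unsigned char>(bytes[index] + 1);
    };
    std::thread a(worker, 0);  // touches bytes[0] only
    std::thread b(worker, 1);  // touches bytes[1] only
    a.join();
    b.join();
    return bytes;              // each byte holds iterations modulo 256
}
```

The overlapping unaligned stores from the follow-up question are a different matter: in C++ terms they are conflicting accesses to the same memory locations, which is a data race and therefore undefined behavior unless synchronized.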

lritter , to random

ok. so as i understand it, docker is one of many libcontainer frontends that use host virtualization to sandbox OS instances. that's more value than just "shipping someone's machine": it gives you a glovebox to operate in so that you can separate processes by interest, and untrusted dependencies can't hack your credentials.

you might want to keep an ssh key in there so you can push to your repos - but that's already a liability. locally share your folders instead, push from a different host.

artificialmind ,

@lritter isn't that the use case for access tokens instead of ssh keys?

artificialmind ,

@lritter they can have an expiration date but my understanding is that their "main feature" is that they grant granular access. iirc, our CI workers get a single-project read-only token to clone a repo. Each worker has their own token (and can thus be invalidated individually) but the expiration is long (like a year I think).

My physical dev environment has a normal ssh key, but any worker/service that needs to interact with repos only gets tokens.
