You don't need to calculate the number of active users yourself. Doing so wastes resources, and you will probably miss a few dozen users anyway.
Simply request the instance's NodeInfo.
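NodeInfo 2.1 (which Lemmy does implement) and, I think, 2.0 as well require implementors to provide correct user usage statistics. So you get the total user count plus the number of users active in the last month and the last half year (usage.users.total, usage.users.activeMonth, usage.users.activeHalfyear), computed by the instance on request.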
This also means you can provide this service for other platforms that support NodeInfo.
A GET request to /.well-known/nodeinfo will give you the links to the instance's NodeInfo documents.
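As a rough sketch of that lookup in Python (standard library only): fetch the discovery document, pick the highest schema version it links to, and read the user counts. The function names and the User-Agent string are made up for illustration, and the code assumes the instance fills in the usage.users fields, which not every server does.

```python
import json
from urllib.request import Request, urlopen

# Prefix of the "rel" values used in the /.well-known/nodeinfo discovery document.
SCHEMA_PREFIX = "http://nodeinfo.diaspora.software/ns/schema/"


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body."""
    req = Request(url, headers={"Accept": "application/json",
                                "User-Agent": "nodeinfo-sketch/0.1"})  # made-up UA
    with urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def pick_nodeinfo_url(discovery):
    """Return the href of the highest-versioned NodeInfo link, or None."""
    best = None
    for link in discovery.get("links", []):
        rel = link.get("rel", "")
        href = link.get("href")
        if not rel.startswith(SCHEMA_PREFIX) or not href:
            continue
        try:
            version = tuple(int(p) for p in rel[len(SCHEMA_PREFIX):].split("."))
        except ValueError:
            continue  # unknown version format, skip it
        if best is None or version > best[0]:
            best = (version, href)
    return best[1] if best else None


def extract_user_stats(nodeinfo):
    """Pull the software name and user counts out of a NodeInfo document."""
    users = nodeinfo.get("usage", {}).get("users", {})
    return {
        "software": nodeinfo.get("software", {}).get("name"),
        "total": users.get("total"),
        "active_month": users.get("activeMonth"),
        "active_halfyear": users.get("activeHalfyear"),
    }


def instance_stats(domain):
    """Discovery request followed by the NodeInfo request for one instance."""
    discovery = fetch_json(f"https://{domain}/.well-known/nodeinfo")
    url = pick_nodeinfo_url(discovery)
    if url is None:
        raise ValueError(f"{domain} does not advertise a NodeInfo document")
    return extract_user_stats(fetch_json(url))
```

Calling instance_stats("some.instance.example") would then do two requests per instance. Missing fields come back as None rather than raising, since servers may leave some counts out.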
In fact, you can start from any known instance, get the list of instances it federates with, fetch their NodeInfo, and repeat the process recursively. NodeInfo also provides the name of the software (software.name in the schema), so you can use that to tell the platforms apart.
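A minimal sketch of that crawl, written as a breadth-first walk. NodeInfo itself does not list peers as far as I know, so the peer list has to come from somewhere platform-specific; here both get_peers and get_stats are hypothetical callables you would supply yourself (get_stats could be a NodeInfo lookup), and max_instances is just a safety cap I added.

```python
from collections import deque


def crawl(seed, get_peers, get_stats, max_instances=100):
    """Breadth-first walk over federated instances, starting from `seed`.

    get_stats(domain) -> any   hypothetical: e.g. a NodeInfo lookup
    get_peers(domain) -> list  hypothetical: platform-specific peer list
    """
    seen = {seed}           # domains already queued, to avoid revisiting
    queue = deque([seed])
    results = {}
    while queue and len(results) < max_instances:
        domain = queue.popleft()
        try:
            results[domain] = get_stats(domain)
        except Exception:
            continue        # unreachable or no NodeInfo: skip this instance
        try:
            peers = get_peers(domain)
        except Exception:
            peers = []      # keep the stats even if the peer list is unavailable
        for peer in peers:
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return results
```

Filtering on the software name in each result lets you keep only the platforms you care about. A real crawler would also want rate limiting and some politeness towards the instances it hits.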
FEP: https://codeberg.org/fediverse/fep/src/branch/main/fep/0151/fep-0151.md