Occasionally we get asked to provide random samples from our database, and it’s the sort of thing that can be done very easily in SAS.

For example, here’s a SAS routine I have often used to extract a sample *sampleset* of size 100 without replacement from a large dataset *bigset *(I can’t remember where this code originally came from but I definitely can’t take credit for it):

data sampleset (drop=remainder size);

size=100;

remainder=totobs;

do while (size > 0);

pickit + 1;

if ranuni(0) < size/remainder then do;
set bigset point=pickit nobs=totobs;
output;
size=size - 1;
end;
remainder=remainder - 1;
end;
stop;
run;[/sourcecode]
While this is usually sufficient, I have sometimes wondered whether it would be possible to use the same logic to extract a sample in VBA. So tonight I decided to test it out. The following code appears to do the trick quite nicely, but be warned, I haven't done a lot of testing yet.
[sourcecode language='vb']Public Sub SampleArray(ByRef avBigset As Variant, ByRef avSmallset As Variant)
' SampleArray populates a random sample avSmallset from an array avBigset
' without replacement (each element in avBigset is considered once only)
Dim lRemainder As Long
Dim lSize As Long
Dim lOb As Long
Dim lPickit As Long
' Make sure we're dealing with arrays...
If Not IsArray(avBigset) Or Not IsArray(avSmallset) Then Exit Sub
' Initialise
lRemainder = UBound(avBigset)
lSize = UBound(avSmallset)
Randomize 0
Do While lSize > 0 ‘ Still some left to pick up

lOb = lOb + 1

If Rnd < lSize / lRemainder Then
lPickit = lPickit + 1
avSmallset(lPickit) = avBigset(lOb)
lSize = lSize - 1
End If
lRemainder = lRemainder - 1
Loop ' Sample complete
End Sub[/sourcecode]
It coped fairly well with populating a sample array of size 1,000 from an array of size 10,000,000, but as I say I haven't tested too extensively so I don't yet know its limits, or whether there is a much more efficient method out there. Has anyone else used VBA in this way? What sort of method would you use?

## One thought on “Random Sampling in VBA”