Thoughts on Collections 

  I have a class where I would like to store
UDTs in collections, but since the class
is private I can't do that. This is a class
to organize data and make it easily searchable.
What I've come up with as the simplest
approach to what I need is a collection that holds
collections. The main collection might hold up to
100 sub-collections. Each sub-collection might
have as many as 30 items, with each item being a
boolean value. I could do this as arrays. It's
just a bit more tedious to code.

   Any thoughts on that in terms of efficiency?
I worry about bogging down with too many
collections. And while I'm on it, I wonder
exactly what a collection is. According to
O'Reilly's VB book a collection is fairly slow.
I'm not really worried about that, but I do wonder
about memory usage. Would a collection be an
object pointer to an array of double 4-byte
values, with each pair being a string pointer
key with long value or pointer item? If that's
the case then a Collection would seem to be
fairly efficient and lean.
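
For reference, the collection-of-collections layout described above might look roughly like the sketch below. The names BuildGroups, colMain and colSub and the key strings are invented for illustration; this is only one possible arrangement, not a recommendation.

```vb
' Hypothetical sketch: a main Collection holding keyed sub-Collections of Booleans.
Public Function BuildGroups() As Collection
    Dim colMain As Collection
    Dim colSub As Collection

    Set colMain = New Collection

    ' One sub-collection per group, stored under a unique string key.
    Set colSub = New Collection
    colSub.Add True, "ItemA"
    colSub.Add False, "ItemB"
    colMain.Add colSub, "Group1"

    Set colSub = New Collection
    colSub.Add False, "ItemA"
    colMain.Add colSub, "Group2"

    Set BuildGroups = colMain
End Function
```

A value can then be read with something like BuildGroups.Item("Group1").Item("ItemA"). Note that every item added to a Collection is held as a Variant (16 bytes in 32-bit VB), so a Boolean stored this way takes more room than a Boolean in an array.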



Tue, 08 May 2012 10:32:38 GMT  

Quote:
>    I have a class where I would like to store
> UDTs in collections, but since the class
> is private I can't do that. This is a class
> to organize data and make it easily searchable.
> What I've come up with as the simplest
> approach to what I need is a collection that holds
> collections. The main collection might hold up to
> 100 sub-collections. Each sub-collection might
> have as many as 30 items, with each item being a
> boolean value. I could do this as arrays. It's
> just a bit more tedious to code.

Errrr, why??
Why not use a class instead?
This is the closest to a type that is supported by collections and is
much more useable (and type safe)
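
A minimal sketch of that suggestion, assuming a small class stands in for the UDT. The class name CGroupItem and its members Caption and Flag are hypothetical; the original post does not say what the UDT contains.

```vb
' CGroupItem class code (hypothetical name) - - - - - - - -
Public Caption As String
Public Flag As Boolean

' Calling code (any module) - - - - - - - -
Public Sub Demo()
    Dim col As Collection
    Dim itm As CGroupItem

    Set col = New Collection

    Set itm = New CGroupItem
    itm.Caption = "first"
    itm.Flag = True
    col.Add itm, itm.Caption     ' the key must be a unique string

    ' Assigning to a typed variable gives early binding and Intellisense.
    Set itm = col("first")
    Debug.Print itm.Caption, itm.Flag
End Sub
```

Unlike a UDT, an object reference can be added to a Collection without any restrictions on where the class is declared.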

--

i-Catcher Development Team

iCode Systems



Tue, 08 May 2012 17:26:39 GMT  


Quote:
>   I have a class where I would like to store
> UDTs in collections, but since the class
> is private I can't do that. This is a class
> to organize data and make it easily searchable.
> What I've come up with as the simplest
> approach to what I need is a collection that holds
> collections. The main collection might hold up to
> 100 sub-collections. Each sub-collection might
> have as many as 30 items, with each item being a
> boolean value. I could do this as arrays. It's
> just a bit more tedious to code.

>    Any thoughts on that in terms of efficiency?
> I worry about bogging down with too many
> collections. And while I'm on it, I wonder
> exactly what a collection is. According to
> O'Reilly's VB book a collection is fairly slow.
> I'm not really worried about that, but I do wonder
> about memory usage. Would a collection be an
> object pointer to an array of double 4-byte
> values, with each pair being a string pointer
> key with long value or pointer item? If that's
> the case then a Collection would seem to be
> fairly efficient and lean.

"Collections" are nothing more than a group of data arranged so you can use
the same model or operations to work with all the elements. This includes
everything from a simple array to a relational database. (Yep, SQL
Server is just a super 'collection'.)
Back in the day of "real" programming these groups and methods for handling
them were called "Data Structures" and programmers spent most of their time
learning and cataloging them.

Take a look at:
http://en.wikibooks.org/wiki/Data_Structures

There are only something this side of a zillion options - take your pick.
But first you might want to narrow down your selection. So forget "How" (ie,
UDTs, classes, arrays, VB Collection, ...) for a bit, take a yellow pad and
start drawing "What" it is you want to do. Until you define how you intend
to add, modify, search, iterate, ... blah, blah, your data - no one can
provide a valid answer.

The result will eventually be expressed using various methods (Hows) but
don't go there yet.

In the meantime consider the following:
The VB Collection Object is slooooww compared to other options. The VB
Collection was supplied as a convenience; built as a general purpose tool,
it is not highly optimized. (It also serves as two very different
Data Structures, but that is another story.) If memory or performance becomes
a concern you will likely be looking for a custom solution.

For the amount of data you are talking about - a simple ready to run out of
the box database product would seem the best route. You could also use a
custom written one.

-ralph



Tue, 08 May 2012 17:30:41 GMT  


Quote:
> The main collection might hold up to
> 100 sub-collections. Each sub-collection might
> have as many as 30 items, with each item being a
> boolean value. I could do this as arrays. It's
> just a bit more tedious to code.

>    Any thoughts on that in terms of efficiency?

The UDT is helpful to store data efficiently, while
the collection is a glorified array of Variants.  If you
move to collections, you would see efficiency go out
the window....

I noticed that your data is all in the form of boolean
variables.  If that is the case you could use 1 bit to
denote True or False, in a value large enough to store
up to 30 items. (30 bits).  One array of Longs would
be able to hold all your data (while also being easy
to read and write to disk).

Similar to what Ralph said, I advise people to store
their data in a form that makes it easy for the program
to use.  What good is efficiency if you have to triple
the code needed, every time you want to access it?

So like Ralph said, decide what you need to do with
the data, and then look for storage methods that will
make those things easy to do....

For an example of using 1 bit to store booleans, consider
the code below.  Class1 contains all the code necessary
(less error handling) to manage the array of data.  I'd say
it isn't tedious at all.

To get at the data you simply supply an Index and
the named value you want (the names being those you
used in the UDT, I used A1, A2, B1, B2, etc...)

Plus Intellisense helps you out along the way....

HTH
LFS

' Form code - - - - - - - - -
Private Sub Form_Load()
  Dim D As Class1
  Set D = New Class1

  D.Item(1, A1) = True
  Debug.Print D.Item(0, A2)
  Debug.Print D.Item(1, A1)
End Sub

' Class1 code  - - - - - - - -
Public Enum Flag
  A1 = 1
  A2 = 2
  B1 = 4
  B2 = 8
  C1 = 16
  C2 = 32
  ' ... etc
End Enum

Private Data(0 To 100) As Long

Public Property Get Item(ByVal Index As Long, ByVal Name As Flag) As Boolean
  Item = CBool(Data(Index) And Name)
End Property

Public Property Let Item(ByVal Index As Long, ByVal Name As Flag, ByVal Value As Boolean)
  If Value Then
    Data(Index) = Data(Index) Or Name
  Else
    Data(Index) = Data(Index) And (Not Name)
  End If
End Property
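
The remark above about the array being easy to read and write to disk could be sketched as two more members of Class1. SaveToFile, LoadFromFile and the Path parameter are hypothetical names, and error handling is left out as in the code above.

```vb
' Possible additions to Class1 (hypothetical) - - - - - - - -
Public Sub SaveToFile(ByVal Path As String)
    Dim f As Integer
    f = FreeFile
    ' Binary mode does not truncate an existing file; that is harmless
    ' here because the array always has the same size.
    Open Path For Binary Access Write As #f
    Put #f, , Data      ' writes all 101 Longs in one call
    Close #f
End Sub

Public Sub LoadFromFile(ByVal Path As String)
    Dim f As Integer
    f = FreeFile
    Open Path For Binary Access Read As #f
    Get #f, , Data      ' reads the file back into the fixed-size array
    Close #f
End Sub
```

With Data declared as 0 To 100 the file comes to 404 bytes (101 Longs of 4 bytes each).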



Tue, 08 May 2012 21:25:32 GMT  
  Thanks all. I am actually trying a class now.
That seemed bulky, but it is convenient. And
I once read that a class adds 90 bytes. (Or is
it 96?). That seems fairly inconsequential if
the number of classes is < 1000 or so.

  I basically need to store groups of strings,
with each having an associated boolean value.
Like so:

Fruit:  apple - true
          orange - false

Grain: bread - false
          rice - true
          corn - false
etc.

  I want to easily return a group list (bread, rice, corn),
check the boolean value for a given string value
(rice = true), etc. If I put that into a class I only
need two simple arrays in the class for storage. I
can then add the classes to a collection in an
over-class, which allows me to call the over-class
with clear methods like GetBool("Grain", "rice").

  There seems to be wide agreement that VB
Collections are not so hot. I didn't realize that
they were actually storing variants. Usually I
use arrays wherever possible because I figure that
they're the most efficient. And inside a class they
can be hidden. But it can get convoluted when there
are arrays in arrays, etc., even if the resulting class
interface is intuitive and usable. So I was interested
in others' ideas.



Tue, 08 May 2012 23:28:29 GMT  



Quote:
>   Thanks all. I am actually trying a class now.
> That seemed bulky, but it is convenient. And
> I once read that a class adds 90 bytes. (Or is
> it 96?). That seems fairly inconsequential if
> the number of classes is < 1000 or so.

>   I basically need to store groups of strings,
> with each having an associated boolean value.
> Like so:

> Fruit:  apple - true
>           orange - false

> Grain: bread - false
>           rice - true
>           corn - false
> etc.

What is it that you really want to achieve?
If for example in "one given fruit-group-instance" there's
only one "value = true" and all the others are false (to
identify a fruit) ... have you already thought about
using Enums instead? If you use powers of 2 for
your Enum values, then you could even combine
different "bool-properties" - effectively simulating
Larry's suggestion of a bitset, stored in a Long value
(Enums are 32-bit types).

But given your other requirements...

Quote:
> ... which allows me to call the over-class
> with clear methods like GetBool("Grain", "rice").

...especially if you plan to persist the
data somewhere, then a DB comes to mind, using
SQL for all kinds of queries (in a very flexible and
comfortable way).

The question (as always) is, what tradeoff you are
willing to accept - a DB-solution would require
some Dlls in your deployment (about 1MB or so)
and it would raise the MemConsumption of your
App in a range of 2MB to 3MB (due to the loaded
DB-engine), + additional Memory, depending on how
many Recordsets you want to keep open, or how
large your "total Data-Set" is (or will grow), in case
you want to hold your Data in an InMemory-DB.
But the benefits are less lines of code in your App
(mostly due to the already powerful data-container-
classes - the Recordsets - which you will not have
to implement yourself) ... + the better and easier
extensibility of the "Data-Model" and the larger
flexibility regarding the queries (just change some
text in a given SQL-string, to adapt to further
query- and filter- or sorting-requirements).

Larry's suggestions regarding bit fields for the boolean
values + using Types and Arrays etc. are good advice
if your total data-set is huge (then bit fields would be
worth the effort, with regard to mem-consumption) -
and/or if performance matters (typed Arrays instead
of Classes stored in Collections). But as I read it
from your description, neither one is the case - your
total data-set is a fairly small one - and
performance does not seem to be critical either - so in that case
you should (or could) use an approach that is flexible
and easier to develop/enhance (with less LOC).
A VB-Classic-App (a small Exe) has some Base-Mem-Overhead
of typically 3-4MB - given that, you should
not waste "hours of coding" just to save some 100kByte (or
let's make that 1MB), if the alternative solution is more
"generic", more dynamically adjustable to further requirements
and "just finished earlier".

Olaf



Wed, 09 May 2012 00:27:24 GMT  

Quote:
>  I basically need to store groups of strings,
> with each having an associated boolean value.
> Like so:

> Fruit:  apple - true
>          orange - false

> Grain: bread - false
>          rice - true
>          corn - false
> etc.

I would use Enum for the entries unless they are user-modifiable.
Collections are slow when you add items to them, but they are fast when you
retrieve items by Key. That's because the Collection object uses a hash
table of the keys you enter(the keys themselves are not stored, but "item"
parameter is), and then uses a fast search algorithm to search the hash and
return the "item".

http://en.wikipedia.org/wiki/Hash_table

To see how much memory they use, add one million entries, then see your app
memory usage, and divide by the number of entries.
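
A rough sketch of that measurement, under the assumption that the process's memory use is read by hand (for example in Task Manager). The routine name FillCollection and the "K" key prefix are made up for illustration.

```vb
' Hypothetical test routine: fill a Collection with one million keyed Bytes.
Public Sub FillCollection()
    Const ENTRY_COUNT As Long = 1000000
    Dim col As Collection
    Dim i As Long

    Set col = New Collection
    For i = 1 To ENTRY_COUNT
        col.Add CByte(i And 1), "K" & i     ' keys must be unique strings
    Next i

    ' Pause here and compare the process's memory use with a run that
    ' skips the loop, then divide the difference by ENTRY_COUNT.
    MsgBox col.Count & " entries added"
End Sub
```

The figure obtained this way includes the key strings, so shorter or longer keys will shift the per-entry result.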

Also, when adding items, you could just use one collection and use the
format "Category:Item" when specifying the key. Example:

col.Add CByte(1), "Fruit:apple"
col.Add CByte(0), "Fruit:orange"
col.Add CByte(0), "Grain:bread"
col.Add CByte(1), "Grain:rice"
col.Add CByte(0), "Grain:corn"

You could make that into an AddItem routine to simplify things:

Public Sub AddItem(ByVal bItem As Boolean, ByRef sKey As String)
    col.Add CByte(bItem), sKey
End Sub

How to get True/False value:

bResult = CBool(col("Grain:corn"))
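
Building on the "Category:Item" key format, a GetBool-style lookup could be sketched as below. InitData and GetBool are hypothetical names, and col is assumed to be a module-level Collection filled as shown above.

```vb
Private col As Collection

' Hypothetical setup: fill the single keyed Collection.
Public Sub InitData()
    Set col = New Collection
    col.Add CByte(1), "Grain:rice"
    col.Add CByte(0), "Grain:corn"
End Sub

' Hypothetical wrapper: builds the combined key and converts the stored Byte.
Public Function GetBool(ByRef sGroup As String, ByRef sItem As String) As Boolean
    ' Raises run-time error 5 if the key was never added.
    GetBool = CBool(col(sGroup & ":" & sItem))
End Function
```

Two points to keep in mind with this layout: Collection keys are compared without regard to case, and a Collection offers no way to list its keys, so returning all the item names of one group would need a separate list.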



Wed, 09 May 2012 01:05:37 GMT  


Quote:

>   I basically need to store groups of strings,
> with each having an associated boolean value.
> Like so:

> Fruit:  apple - true
>           orange - false

> Grain: bread - false
>           rice - true
>           corn - false
> etc.

>   I want to easily return a group list (bread, rice, corn),
> check the boolean value for a given string value
> (rice = true), etc. If I put that into a class I only
> need two simple arrays in the class for storage.

Do you really need two arrays?  Keeping associated
data in different structures has to be managed carefully
to keep them 'in sync'.  Consider:

APPLE orange
bread RICE corn

There you have two lists, including a means to differentiate
between the items:

' return a list
Value = LCase$(list)

' return a value
Value = (item = UCase$(item))

Just some 'food for thought'...   ;-)

LFS



Wed, 09 May 2012 03:32:07 GMT  

Quote:
> Do you really need two arrays?  Keeping associated
> data in different structures has to be managed carefully
> to keep them 'in sync'.

  Yes, but I figured on adding to both at the same
time. And I can't use the case ID because the names
need to keep their case. I also wanted to avoid
string operations. (Nobody had suggested an interesting
way to add all data to a single collection item by
concatenating two strings, but that means a lot of
string building behind the scenes.)

   But now I have another idea. I originally
wanted to use an array of UDTs. That would make
it easy, orderly, and clear to pass data around.
The problem with that, of course, is that I can't
use a UDT as a parameter if it's declared in a
private class or a .bas module. But there's something
I hadn't thought of:
  It works to declare the UDT in a public-not-creatable
class. (This is a DLL so that's an option.)  It seems
bizarre that I can't use a globally declared UDT in a
private class, but can if it's declared in a public class
that never gets instantiated. But it works. Ideally I'd
like to mark that class hidden for typelib readers,
but VB doesn't seem to provide that option. I guess
the best I can do is to just name it something
"unpleasant", like "C__" and provide no method to
access an instance.



Wed, 09 May 2012 05:53:40 GMT  

Quote:
> The problem with that, of course, is that I can't
> use a UDT as a parameter if it's declared in a
> private class

Yes you can. Use "Friend" instead of "Public". See this post for details:

http://groups.google.com/group/microsoft.public.vb.general.discussion...



Wed, 09 May 2012 07:01:23 GMT  
   I know about Friend but it doesn't seem to
apply to UDTs. If I declare Friend Type
I get an error "only valid for functions, subs or
properties". I can declare it public, but it won't
work in a private class. The class has to be
public. Using a type2 class works fine, but it
would be nice to make it hidden, and I doubt
that I can locate the flag... in the typelib... in
the compiled DLL... to perform such an edit.

Quote:
> > The problem with that, of course, is that I can't
> > use a UDT as a parameter if it's declared in a
> > private class

> Yes you can. Use "Friend" instead of "Public". See this post for details:

http://groups.google.com/group/microsoft.public.vb.general.discussion...
476df93b1d077a


Wed, 09 May 2012 08:09:16 GMT  


Quote:
>    But now I have another idea. I originally
> wanted to use an array of UDTs. That would make
> it easy, orderly, and clear to pass data around.

I am curious as to how the data you posted will fit
into an array of UDTs.

eg:
Fruit:  apple - true
          orange - false

Grain: bread - false
          rice - true
          corn - false
etc.

???
LFS



Wed, 09 May 2012 08:05:44 GMT  


Quote:
> > Do you really need two arrays?  Keeping associated
> > data in different structures has to be managed carefully
> > to keep them 'in sync'.

>   Yes, but I figured on adding to both at the same
> time. And I can't use the case ID because the names
> need to keep their case. I also wanted to avoid
> string operations. (Nobody had suggested an interesting
> way to add all data to a single collection item by
> concatenating two strings, but that means a lot of
> string building behind the scenes.)

>    But now I have another idea. I originally
> wanted to use an array of UDTs. That would make
> it easy, orderly, and clear to pass data around.
> The problem with that, of course, is that I can't
> use a UDT as a parameter if it's declared in a
> private class or a .bas module. But there's something
> I hadn't thought of:
>   It works to declare the UDT in a public-not-creatable
> class. (This is a DLL so that's an option.)  It seems
> bizarre that I can't use a globally declared UDT in a
> private class, but can if it's declared in a public class
> that never gets instantiated. But it works. Ideally I'd
> like to mark that class hidden for typelib readers,
> but VB doesn't seem to provide that option. I guess
> the best I can do is to just name it something
> "unpleasant", like "C__" and provide no method to
> access an instance.

Declare the UDT as a Struct in a TypeLibrary

-ralph



Wed, 09 May 2012 09:15:53 GMT  
mayayana wrote:

Quote:
>    I know about Friend but it doesn't seem to
> apply to UDTs. If I declare Friend Type
> I get an error "only valid for functions, subs or
> properties". I can declare it public, but it won't
> work in a private class. The class has to be
> public. Using a type2 class works fine, but it
> would be nice to make it hidden, and I doubt
> that I can locate the flag... in the typelib... in
> the compiled DLL... to perform such an edit.

>>> The problem with that, of course, is that I can't
>>> use a UDT as a parameter if it's declared in a
>>> private class
>> Yes you can. Use "Friend" instead of "Public". See this post for details:

> http://groups.google.com/group/microsoft.public.vb.general.discussion...
> 476df93b1d077a

You could copy the UDT with CopyMemory, then you need to pass just a
Long value that you get with VarPtr:

http://groups.google.com/group/microsoft.public.vb.general.discussion...
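
A minimal sketch of the CopyMemory idea, assuming a UDT with fixed-size numeric members only. The type ItemInfo and the routines Sender and Receiver are hypothetical. A UDT that contains variable-length Strings should not be copied this way, because only the string pointers are duplicated and both copies would end up owning the same string.

```vb
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (ByRef Destination As Any, ByRef Source As Any, ByVal Length As Long)

Private Type ItemInfo          ' hypothetical UDT, fixed-size members only
    Id As Long
    Flag As Boolean
End Type

' Caller side: pass the address of the UDT as a plain Long.
Public Sub Sender()
    Dim src As ItemInfo
    src.Id = 7
    src.Flag = True
    Receiver VarPtr(src)
End Sub

' Receiver side: copy the bytes behind the pointer into a local UDT.
Public Sub Receiver(ByVal lPtr As Long)
    Dim dst As ItemInfo
    CopyMemory dst, ByVal lPtr, LenB(dst)
    Debug.Print dst.Id, dst.Flag
End Sub
```

The pointer is only valid while the source variable is alive, so the copy has to happen before the caller's UDT goes out of scope.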



Wed, 09 May 2012 10:09:15 GMT  

Quote:

> I am curious as to how the data you posted will fit
> into an array of UDTs.

  It doesn't quite fit like that.

   I just wanted UDTs for the string-boolean pair.
(And as it turns out a third member will be handy
in that UDT.)

   What I've got at this point is a class with 4 arrays:

ArrayofGroupNames
ArrayOfGroupItemUDTArrays
ArrayToHoldNumberofItemsPerGroup

A 4th array holds ubound of the item arrays

I keep track of the number of groups, and the
Group index in the group name array is also the
corresponding index into the array of UDT arrays
to hold item data.

   Each time I add a group I check for the need
to redim. Then instead of GroupExists I'm using
GroupIndex. GroupIndex loops through the group
names array and returns the index or -1.  -1 means
it doesn't exist. To add an item I check the GroupIndex
property, then add the item (UDT) to the corresponding
index position of the UDT array storage array.

  By adjusting counters whenever a group or item
is added, I always know the number of groups and
the number of items in each group.

  It's ugly and abstruse inside the class. That's why
I was looking into other ways to approach it. But I
think it must be quite efficient since it's just arrays,
and it can be intuitively designed in terms of the
interface methods.
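
The group lookup described above could be sketched as follows. The names GroupNames, GroupCount and AddGroup, and the chunk size of 16, are assumptions for illustration; only GroupIndex and the -1 convention come from the description.

```vb
' Hypothetical fragment of the class described above.
Private GroupNames() As String   ' index = group index
Private GroupCount As Long       ' number of groups in use

' Returns the index of sGroup in GroupNames, or -1 if it is not present.
Public Function GroupIndex(ByRef sGroup As String) As Long
    Dim i As Long
    GroupIndex = -1
    For i = 0 To GroupCount - 1
        If GroupNames(i) = sGroup Then
            GroupIndex = i
            Exit For
        End If
    Next i
End Function

' Adds a group if needed and returns its index, growing the array in chunks.
Public Function AddGroup(ByRef sGroup As String) As Long
    Dim idx As Long
    idx = GroupIndex(sGroup)
    If idx < 0 Then
        If GroupCount = 0 Then
            ReDim GroupNames(0 To 15)
        ElseIf GroupCount > UBound(GroupNames) Then
            ReDim Preserve GroupNames(0 To UBound(GroupNames) + 16)
        End If
        idx = GroupCount
        GroupNames(idx) = sGroup
        GroupCount = GroupCount + 1
    End If
    AddGroup = idx
End Function
```

The same index returned by AddGroup would then be used for the array of item UDT arrays and for the per-group item counters.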

Quote:
> eg:
> Fruit:  apple - true
>           orange - false

> Grain: bread - false
>           rice - true
>           corn - false
> etc.

> ???
> LFS



Wed, 09 May 2012 10:55:16 GMT  
 