Author |
Message |
mayayan #1 / 21
|
 Thoughts on Collections
I have a class where I would like to store UDTs in collections, but since the class is private I can't do that. This is a class to organize data and make it easily searchable. What I've come up with as the simplest approach to what I need is a collection that holds collections. The main collection might hold up to 100 sub-collections. Each sub-collection might have as many as 30 items, with each item being a boolean value. I could do this as arrays. It's just a bit more tedious to code. Any thoughts on that in terms of efficiency? I worry about bogging down with too many collections. And while I'm on it, I wonder exactly what a collection is. According to O'Reilly's VB book a collection is fairly slow. I'm not really worried about that, but I do wonder about memory usage. Would a collection be an object pointer to an array of double 4-byte values, with each pair being a string pointer key with long value or pointer item? If that's the case then a Collection would seem to be fairly efficient and lean.
|
Tue, 08 May 2012 10:32:38 GMT |
|
 |
Dee Earle #2 / 21
|
 Thoughts on Collections
Quote:
> I have a class where I would like to store
> UDTs in collections, but since the class
> is private I can't do that. This is a class
> to organize data and make it easily searchable.
> What I've come up with as the simplest
> approach to what I need is a collection that holds
> collections. The main collection might hold up to
> 100 sub-collections. Each sub-collection might
> have as many as 30 items, with each item being a
> boolean value. I could do this as arrays. It's
> just a bit more tedious to code.
Errrr, why?? Why not use a class instead? This is the closest to a type that is supported by collections and is much more useable (and type safe) --
i-Catcher Development Team iCode Systems
|
Tue, 08 May 2012 17:26:39 GMT |
|
 |
Ralp #3 / 21
|
 Thoughts on Collections
Quote:
> I have a class where I would like to store
> UDTs in collections, but since the class
> is private I can't do that. This is a class
> to organize data and make it easily searchable.
> What I've come up with as the simplest
> approach to what I need is a collection that holds
> collections. The main collection might hold up to
> 100 sub-collections. Each sub-collection might
> have as many as 30 items, with each item being a
> boolean value. I could do this as arrays. It's
> just a bit more tedious to code.
> Any thoughts on that in terms of efficiency?
> I worry about bogging down with too many
> collections. And while I'm on it, I wonder
> exactly what a collection is. According to
> O'Reilly's VB book a collection is fairly slow.
> I'm not really worried about that, but I do wonder
> about memory usage. Would a collection be an
> object pointer to an array of double 4-byte
> values, with each pair being a string pointer
> key with long value or pointer item? If that's
> the case then a Collection would seem to be
> fairly efficient and lean.
"Collections" are nothing more than a group of data arranged so you can use the same model or operations to work with all the elements. This includes everything from a simple array to a relational database. (Yep, all SQL Server is is a super 'collection'.) Back in the day of "real" programming these groups and the methods for handling them were called "Data Structures", and programmers spent most of their time learning and cataloging them. Take a look at: http://en.wikibooks.org/wiki/Data_Structures

There are only something this side of a zillion options - take your pick. But first you might want to narrow down your selection. So forget "How" (i.e., UDTs, classes, arrays, VB Collection, ...) for a bit, take a yellow pad and start drawing "What" it is you want to do. Until you define how you intend to add, modify, search, iterate, ... blah, blah, your data - no one can provide a valid answer. The result will eventually be expressed using various methods (Hows), but don't go there yet.

In the meantime consider the following: The VB Collection Object is slooooww compared to other options. The VB Collection was supplied as a convenience; it is not highly optimized, built as it is as a general-purpose tool. (It also serves as two very different Data Structures, but that is another story.) If memory or performance becomes a concern you will likely be looking for a custom solution. For the amount of data you are talking about, a simple ready-to-run, out-of-the-box database product would seem the best route. You could also use a custom-written one.

-ralph
|
Tue, 08 May 2012 17:30:41 GMT |
|
 |
Larry Serflate #4 / 21
|
 Thoughts on Collections
Quote:
> The main collection might hold up to
> 100 sub-collections. Each sub-collection might
> have as many as 30 items, with each item being a
> boolean value. I could do this as arrays. It's
> just a bit more tedious to code.
> Any thoughts on that in terms of efficiency?
The UDT is helpful to store data efficiently, while the collection is a glorified array of Variants. If you move to collections, you would see efficiency go out the window....

I noticed that your data is all in the form of boolean variables. If that is the case you could use 1 bit to denote True or False, in a value large enough to store up to 30 items (30 bits). One array of Longs would be able to hold all your data (while also being easy to read and write to disk).

Similar to what Ralph said, I advise people to store their data in a form that makes it easy for the program to use. What good is efficiency if you have to triple the code needed, every time you want to access it? So like Ralph said, decide what you need to do with the data, and then look for storage methods that will make those things easy to do....

For an example of using 1 bit to store booleans, consider the code below. Class1 contains all the code necessary (less error handling) to manage the array of data. I'd say it isn't tedious at all. To get at the data you simply supply an Index and the named value you want (the names being those you used in the UDT; I used A1, A2, B1, B2, etc...). Plus Intellisense helps you out along the way....

HTH
LFS

' Form code - - - - - - - - -
Private Sub Form_Load()
    Dim D As Class1
    Set D = New Class1
    D.Item(1, A1) = True
    Debug.Print D.Item(0, A2)
    Debug.Print D.Item(1, A1)
End Sub

' Class1 code - - - - - - - -
Public Enum Flag
    A1 = 1
    A2 = 2
    B1 = 4
    B2 = 8
    C1 = 16
    C2 = 32
    ' ... etc
End Enum

Private Data(0 To 100) As Long

Public Property Get Item(ByVal Index As Long, ByVal Name As Flag) As Boolean
    Item = CBool(Data(Index) And Name)
End Property

Public Property Let Item(ByVal Index As Long, ByVal Name As Flag, ByVal Value As Boolean)
    If Value Then
        Data(Index) = Data(Index) Or Name
    Else
        Data(Index) = Data(Index) And (Not Name)
    End If
End Property
|
Tue, 08 May 2012 21:25:32 GMT |
|
 |
mayayan #5 / 21
|
 Thoughts on Collections
Thanks all. I am actually trying a class now. That seemed bulky, but it is convenient. And I once read that a class adds 90 bytes. (Or is it 96?) That seems fairly inconsequential if the number of classes is < 1000 or so.

I basically need to store groups of strings, with each having an associated boolean value. Like so:

Fruit: apple - true
       orange - false
Grain: bread - false
       rice - true
       corn - false
etc.

I want to easily return a group list (bread, rice, corn), check the boolean value for a given string value (rice = true), etc. If I put that into a class I only need two simple arrays in the class for storage. I can then add the classes to a collection in an over-class, which allows me to call the over-class with clear methods like GetBool("Grain", "rice").

There seems to be wide agreement that VB Collections are not so hot. I didn't realize that they were actually storing variants. Usually I use arrays wherever possible because I figure that they're the most efficient. And inside a class they can be hidden. But it can get convoluted when there are arrays in arrays, etc., even if the resulting class interface is intuitive and usable. So I was interested in others' ideas.
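For readers following along, here is a minimal VB6 sketch of the layout described above: one class per group holding two parallel arrays, plus an over-class that keeps the groups in a Collection keyed by group name. The names CGroup, CGroups, Add, and AddItem are hypothetical; only GetBool("Grain", "rice") comes from the post. It is a sketch under those assumptions, not the poster's actual code.

```vb
' --- Class module CGroup (hypothetical name) ---
Option Explicit

Private mNames() As String      ' item names
Private mFlags() As Boolean     ' flag for the item at the same index
Private mCount As Long

Public Sub Add(ByVal ItemName As String, ByVal Flag As Boolean)
    ' Both arrays grow in the same place so they stay in sync.
    ReDim Preserve mNames(0 To mCount)
    ReDim Preserve mFlags(0 To mCount)
    mNames(mCount) = ItemName
    mFlags(mCount) = Flag
    mCount = mCount + 1
End Sub

Public Function GetBool(ByVal ItemName As String) As Boolean
    Dim i As Long
    For i = 0 To mCount - 1
        If mNames(i) = ItemName Then
            GetBool = mFlags(i)
            Exit Function
        End If
    Next
    Err.Raise 5, , "Unknown item: " & ItemName
End Function

' --- Class module CGroups (hypothetical over-class) ---
Option Explicit

Private mGroups As Collection

Private Sub Class_Initialize()
    Set mGroups = New Collection
End Sub

Public Sub AddItem(ByVal GroupName As String, ByVal ItemName As String, _
                   ByVal Flag As Boolean)
    Dim g As CGroup
    On Error Resume Next
    Set g = mGroups(GroupName)   ' raises an error if the group is new
    On Error GoTo 0
    If g Is Nothing Then
        Set g = New CGroup
        mGroups.Add g, GroupName
    End If
    g.Add ItemName, Flag
End Sub

Public Function GetBool(ByVal GroupName As String, _
                        ByVal ItemName As String) As Boolean
    Dim g As CGroup
    Set g = mGroups(GroupName)
    GetBool = g.GetBool(ItemName)
End Function
```

Usage would look like gs.AddItem "Grain", "rice", True followed by Debug.Print gs.GetBool("Grain", "rice"). Note that Collection keys are matched without regard to case, while the item comparison inside CGroup is case-sensitive under the default Option Compare Binary.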
|
Tue, 08 May 2012 23:28:29 GMT |
|
 |
Schmid #6 / 21
|
 Thoughts on Collections
Quote:
> Thanks all. I am actually trying a class now.
> That seemed bulky, but it is convenient. And
> I once read that a class adds 90 bytes. (Or is
> it 96?). That seems fairly inconsequential if
> the number of classes is < 1000 or so.
> I basically need to store groups of strings,
> with each having an associated boolean value.
> Like so:
> Fruit: apple - true
> orange - false
> Grain: bread - false
> rice - true
> corn - false
> etc.
What is it that you really want to achieve? If, for example, in "one given fruit-group-instance" there's only one "value = true" and all the others are false (to identify a fruit) ... have you already thought about using Enums instead? If you use powers of 2 for your Enum values, then you could even combine different "bool-properties" - effectively simulating Larry's suggestion of a bitset, stored in a Long value (Enums are 32-bit types). But given your other requirements...

Quote:
> ... which allows me to call the over-class
> with clear methods like GetBool("Grain", "rice").

...especially if you plan to persist the data somewhere, a DB comes to mind, using SQL for all kinds of queries (in a very flexible and comfortable way).

The question (as always) is what tradeoff you are willing to accept. A DB solution would require some DLLs in your deployment (about 1MB or so), and it would raise the memory consumption of your app by 2MB to 3MB (due to the loaded DB engine), plus additional memory depending on how many Recordsets you want to keep open, or how large your total data set is (or will grow) in case you want to hold your data in an in-memory DB. But the benefits are fewer lines of code in your app (mostly due to the already powerful data-container classes - the Recordsets - which you will not have to implement yourself), plus the better and easier extensibility of the data model and the greater flexibility regarding the queries (just change some text in a given SQL string to adapt to further query, filter, or sorting requirements).

Larry's suggestions regarding bit fields for the boolean values, plus using Types and Arrays etc., are good advice if your total data set is huge (then bit fields would be worth the effort with regard to memory consumption) and/or if performance matters (typed Arrays instead of Classes stored in Collections). But as I read it from your description, neither one is the case: your total data set is a fairly small one, and performance does not seem to be critical. So in that case you should (or could) use an approach that is flexible and easier to develop and enhance (with fewer LOC).

A VB-Classic app (a small Exe) has a base memory overhead of typically 3-4MB. Given that, you should not waste hours of coding just to save some 100 kByte (or let's make that 1MB) if the alternative solution is more generic, more dynamically adjustable to further requirements, and "just finished earlier".

Olaf
|
Wed, 09 May 2012 00:27:24 GMT |
|
 |
Nobod #7 / 21
|
 Thoughts on Collections
Quote:
> I basically need to store groups of strings,
> with each having an associated boolean value.
> Like so:
> Fruit: apple - true
> orange - false
> Grain: bread - false
> rice - true
> corn - false
> etc.
I would use Enum for the entries unless they are user-modifiable. Collections are slow when you add items to them, but they are fast when you retrieve items by Key. That's because the Collection object uses a hash table of the keys you enter (the keys themselves cannot be read back through the object's interface, but the "item" parameter can), and then uses a fast search algorithm to search the hash and return the "item". http://en.wikipedia.org/wiki/Hash_table

To see how much memory they use, add one million entries, then check your app's memory usage and divide by the number of entries.

Also, when adding items, you could use just one collection and use the format "Category:Item" when specifying the key. Example:

col.Add CByte(1), "Fruit:apple"
col.Add CByte(0), "Fruit:orange"
col.Add CByte(0), "Grain:bread"
col.Add CByte(1), "Grain:rice"
col.Add CByte(0), "Grain:corn"

You could make that into an AddItem routine to simplify things:

Public Sub AddItem(ByVal bItem As Boolean, ByRef sKey As String)
    col.Add CByte(bItem), sKey
End Sub

How to get the True/False value:

bResult = CBool(col("Grain:corn"))
|
Wed, 09 May 2012 01:05:37 GMT |
|
 |
Larry Serflate #8 / 21
|
 Thoughts on Collections
Quote:
> I basically need to store groups of strings,
> with each having an associated boolean value.
> Like so:
> Fruit: apple - true
> orange - false
> Grain: bread - false
> rice - true
> corn - false
> etc.
> I want to easily return a group list (bread, rice, corn),
> check the boolean value for a given string value
> (rice = true), etc. If I put that into a class I only
> need two simple arrays in the class for storage.
Do you really need two arrays? Keeping associated data in different structures has to be managed carefully to keep them 'in sync'. Consider:

APPLE
orange
bread
RICE
corn

There you have two lists, including a means to differentiate between the items:

' return a list
Value = LCase$(list)

' return a value
Value = (item = UCase$(item))

Just some 'food for thought'... ;-)

LFS
|
Wed, 09 May 2012 03:32:07 GMT |
|
 |
mayayan #9 / 21
|
 Thoughts on Collections
Quote:
> Do you really need two arrays? Keeping associated
> data in different structures has to be managed carefully
> to keep them 'in sync'.
Yes, but I figured on adding to both at the same time. And I can't use the case ID because the names need to keep their case. I also wanted to avoid string operations. (Nobody had suggested an interesting way to add all data to a single collection item by concatenating two strings, but that means a lot of string building behind the scenes.)

But now I have another idea. I originally wanted to use an array of UDTs. That would make it easy, orderly, and clear to pass data around. The problem with that, of course, is that I can't use a UDT as a parameter if it's declared in a private class or a .bas module. But there's something I hadn't thought of:

It works to declare the UDT in a public-not-creatable class. (This is a DLL so that's an option.) It seems bizarre that I can't use a globally declared UDT in a private class, but can if it's declared in a public class that never gets instantiated. But it works. Ideally I'd like to mark that class hidden for typelib readers, but VB doesn't seem to provide that option. I guess the best I can do is to just name it something "unpleasant", like "C__" and provide no method to access an instance.
|
Wed, 09 May 2012 05:53:40 GMT |
|
 |
Nobod #10 / 21
|
 Thoughts on Collections
Quote:
> The problem with that, of course, is that I can't
> use a UDT as a parameter if it's declared in a
> private class
Yes you can. Use "Friend" instead of "Public". See this post for details: http://groups.google.com/group/microsoft.public.vb.general.discussion...
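To make the suggestion concrete: the idea, as I read it, is that Friend goes on the procedure that takes or returns the UDT, not on the Type declaration itself. The sketch below assumes a UDT declared Public in a standard module and a private class that passes it around; the names modTypes, ItemInfo, CStore, AddItem, and GetItem are hypothetical.

```vb
' --- Standard module (.bas), hypothetical name modTypes ---
Option Explicit

Public Type ItemInfo
    Text As String
    Flag As Boolean
End Type

' --- Private class module, hypothetical name CStore ---
Option Explicit

Private mItems() As ItemInfo
Private mCount As Long

' Friend instead of Public: visible throughout the project but not part
' of the class's public COM interface, so a UDT from a .bas module is
' accepted as a parameter. UDT parameters must be passed ByRef.
Friend Sub AddItem(ByRef Info As ItemInfo)
    ReDim Preserve mItems(0 To mCount)
    mItems(mCount) = Info
    mCount = mCount + 1
End Sub

' A Friend function can also return the UDT.
Friend Function GetItem(ByVal Index As Long) As ItemInfo
    GetItem = mItems(Index)
End Function
```

Friend members can only be called from code inside the same project, so this fits when the UDT never has to cross the DLL boundary.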
|
Wed, 09 May 2012 07:01:23 GMT |
|
 |
mayayan #11 / 21
|
 Thoughts on Collections
I know about Friend but it doesn't seem to apply to UDTs. If I declare Friend Type I get an error "only valid for functions, subs or properties". I can declare it public, but it won't work in a private class. The class has to be public. Using a type2 class works fine, but it would be nice to make it hidden, and I doubt that I can locate the flag... in the typelib... in the compiled DLL... to perform such an edit.

Quote:
> > The problem with that, of course, is that I can't
> > use a UDT as a parameter if it's declared in a
> > private class
> Yes you can. Use "Friend" instead of "Public". See this post for details:
> http://groups.google.com/group/microsoft.public.vb.general.discussion... 476df93b1d077a
|
Wed, 09 May 2012 08:09:16 GMT |
|
 |
Larry Serflate #12 / 21
|
 Thoughts on Collections
Quote:
> But now I have another idea. I originally
> wanted to use an array of UDTs. That would make
> it easy, orderly, and clear to pass data around.
I am curious as to how the data you posted will fit into an array of UDTs. E.g.:

Fruit: apple - true
       orange - false
Grain: bread - false
       rice - true
       corn - false
etc.

???

LFS
|
Wed, 09 May 2012 08:05:44 GMT |
|
 |
Ralp #13 / 21
|
 Thoughts on Collections
Quote:
> > Do you really need two arrays? Keeping associated
> > data in different structures has to be managed carefully
> > to keep them 'in sync'.
> Yes, but I figured on adding to both at the same
> time. And I can't use the case ID because the names
> need to keep their case. I also wanted to avoid
> string operations. (Nobody had suggested an interesting
> way to add all data to a single collection item by
> concatenating two strings, but that means a lot of
> string building behind the scenes.)
> But now I have another idea. I originally
> wanted to use an array of UDTs. That would make
> it easy, orderly, and clear to pass data around.
> The problem with that, of course, is that I can't
> use a UDT as a parameter if it's declared in a
> private class or a .bas module. But there's something
> I hadn't thought of:
> It works to declare the UDT in a public-not-creatable
> class. (This is a DLL so that's an option.) It seems
> bizarre that I can't use a globally declared UDT in a
> private class, but can if it's declared in a public class
> that never gets instantiated. But it works. Ideally I'd
> like to mark that class hidden for typelib readers,
> but VB doesn't seem to provide that option. I guess
> the best I can do is to just name it something
> "unpleasant", like "C__" and provide no method to
> access an instance.
Declare the UDT as a Struct in a TypeLibrary -ralph
|
Wed, 09 May 2012 09:15:53 GMT |
|
 |
Eduard #14 / 21
|
 Thoughts on Collections
mayayana wrote:
Quote:
> I know about Friend but it doesn't seem to
> apply to UDTs. If I declare Friend Type
> I get an error "only valid for functions, subs or
> properties". I can declare it public, but it won't
> work in a private class. The class has to be
> public. Using a type2 class works fine, but it
> would be nice to make it hidden, and I doubt
> that I can locate the flag... in the typelib... in
> the compiled DLL... to perform such an edit.
>>> The problem with that, of course, is that I can't
>>> use a UDT as a parameter if it's declared in a
>>> private class
>> Yes you can. Use "Friend" instead of "Public". See this post for details:
> http://groups.google.com/group/microsoft.public.vb.general.discussion...
> 476df93b1d077a
You could copy the UDT with CopyMemory, then you need to pass just a Long value that you get with VarPtr: http://groups.google.com/group/microsoft.public.vb.general.discussion...
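A rough sketch of that approach, under some assumptions: the UDT here (FlagPair) contains only fixed-size members, because a raw byte copy of a UDT holding a String would duplicate the string pointer rather than the string, which is unsafe. The names FlagPair, CReceiver, Store, LastId, and Demo are hypothetical; CopyMemory is the usual alias for the kernel32 function RtlMoveMemory.

```vb
' --- Standard module (.bas) ---
Option Explicit

Public Type FlagPair            ' fixed-size members only (assumption)
    Id As Long
    Flag As Boolean
End Type

Public Sub Demo()
    Dim p As FlagPair
    Dim r As CReceiver
    p.Id = 42
    p.Flag = True
    Set r = New CReceiver
    r.Store VarPtr(p)           ' pass only the address, as a Long
    Debug.Print r.LastId
End Sub

' --- Class module, hypothetical name CReceiver ---
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)

Private mLast As FlagPair

' The public signature contains only a Long, so the UDT restriction
' on public class members does not come into play.
Public Sub Store(ByVal UdtPtr As Long)
    ' Copy the caller's bytes into our own variable of the same type.
    CopyMemory mLast, ByVal UdtPtr, LenB(mLast)
End Sub

Public Property Get LastId() As Long
    LastId = mLast.Id
End Property
```

The pointer is only valid while the caller's variable is alive, so the copy has to happen inside the call; nothing checks that the address really points at a FlagPair.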
|
Wed, 09 May 2012 10:09:15 GMT |
|
 |
mayayan #15 / 21
|
 Thoughts on Collections
Quote:
> I am curious as to how the data you posted will fit
> into a array of UDTs.
It doesn't quite fit like that. I just wanted UDTs for the string-boolean pair. (And as it turns out a third member will be handy in that UDT.) What I've got at this point is a class with 4 arrays:

ArrayofGroupNames
ArrayOfGroupItemUDTArrays
ArrayToHoldNumberofItemsPerGroup
A 4th array holds the UBound of each of the item arrays

I keep track of the number of groups, and the group index in the group name array is also the corresponding index into the array of UDT arrays that holds the item data. Each time I add a group I check for the need to ReDim. Then instead of GroupExists I'm using GroupIndex. GroupIndex loops through the group names array and returns the index or -1; -1 means the group doesn't exist. To add an item I check the GroupIndex property, then add the item (UDT) to the corresponding index position of the UDT array storage array. By adjusting counters whenever a group or item is added, I always know the number of groups and the number of items in each group.

It's ugly and abstruse inside the class. That's why I was looking into other ways to approach it. But I think it must be quite efficient since it's just arrays, and it can be intuitively designed in terms of the interface methods.

Quote:
> eg:
> Fruit: apple - true
> orange - false
> Grain: bread - false
> rice - true
> corn - false
> etc.
> ???
> LFS
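The group-lookup part of this bookkeeping might look roughly like the sketch below. Only the name GroupIndex and the "index or -1" convention come from the post; the array and counter names, the AddGroup routine, and the doubling growth policy are assumptions made for illustration, and the per-group item arrays are left out.

```vb
' --- Inside the data class (sketch; names are hypothetical) ---
Option Explicit

Private mGroupNames() As String
Private mGroupCount As Long

' Returns the index of GroupName, or -1 if the group does not exist.
Public Function GroupIndex(ByVal GroupName As String) As Long
    Dim i As Long
    GroupIndex = -1
    For i = 0 To mGroupCount - 1
        If mGroupNames(i) = GroupName Then
            GroupIndex = i
            Exit Function
        End If
    Next
End Function

' Adds the group if it is new and returns its index either way.
Public Function AddGroup(ByVal GroupName As String) As Long
    Dim idx As Long
    idx = GroupIndex(GroupName)
    If idx < 0 Then
        ' Grow in chunks so ReDim Preserve is not needed on every add.
        If mGroupCount = 0 Then
            ReDim mGroupNames(0 To 7)
        ElseIf mGroupCount > UBound(mGroupNames) Then
            ReDim Preserve mGroupNames(0 To mGroupCount * 2 - 1)
        End If
        idx = mGroupCount
        mGroupNames(idx) = GroupName
        mGroupCount = mGroupCount + 1
    End If
    AddGroup = idx
End Function
```

With at most around 100 groups, the linear search in GroupIndex is unlikely to matter; the same index can then be used to address the parallel arrays that hold each group's items and counts.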
|
Wed, 09 May 2012 10:55:16 GMT |
|
|