scanf() type function 

  I have a string (actually, 82,000 of them) that has 6 integers in it. In C,
I'd use sscanf() to pull the numbers out. Is there a good way to do this
operation in Dylan? I'm using d2c if that matters.

  Thanks,

  Mike McDonald



Sun, 16 Dec 2001 03:00:00 GMT  


Quote:
>   I have a string (actually, 82,000 of them) that has 6 integers in it. In C,
> I'd use sscanf() to pull the numbers out. Is there a good way to do this
> operation in dylan? I'm using d2c if that matters.

Here's one solution:

------------------- parseString-exports.dylan --------------------
module: dylan-user

define library parseString
  use dylan;
  use format-out;
  use regular-expressions;
  use string-extensions;
  //use streams;
  //use standard-io;
  //use format;
end library;

define module parseString
  use dylan;
  use extensions;
  use format-out;
  use regular-expressions;
  use string-conversions; // from string-extensions
  //use streams;
  //use standard-io;
  //use format;
end module;
---------------------- parseString.dylan -------------------------
module: parseString
synopsis:
author:
copyright:

define method main(appname, #rest arguments)

  let s = "10 23 9999 42 67 78";
  let (a, b, c, d, e, f) = split(" ", s);

  let v = vector(a, b, c, d, e, f);
  let n = map(string-to-integer, v);

  format-out("individual strings are %s, %s, %s, %s, %s, %s\n",
             a, b, c, d, e, f);

  for (i from 0 below n.size)
    format-out("n[%d] is %d, one more is %d\n", i, n[i], n[i] + 1);
  end;

  exit(exit-code: 0);
end method main;
------------------------------------------------------------------

One thing I'm confused about:

  Is there an easy way to take the result of split() -- which is produced
by values() -- and turn it directly into a vector/list?  I tried just
doing...

   vector(split(" ", s))

... but it produced a 1-element vector.

-- Bruce



Mon, 17 Dec 2001 03:00:00 GMT  


Quote:


>>   I have a string (actually, 82,000 of them) that has 6 integers in it. In C,
>> I'd use sscanf() to pull the numbers out. Is there a good way to do this
>> operation in dylan? I'm using d2c if that matters.

> Here's one solution:

  This solution creates more garbage than mine, which is already a major dog:

define method parse-net(str :: <string>) => (net :: <net>)
  let (num, count, left, bottom, right, top) = split-line(str);
  make(<net>,
       number: string-to-integer(num),
       weight: string-to-integer(count),
       left: string-to-integer(left),
       bottom: string-to-integer(bottom),
       right: string-to-integer(right),
       top: string-to-integer(top));
end;

  Remember, there are 82,000 of these lines!

Quote:
> One thing I'm confused about:

>   Is there an easy way to take the result of split() -- which is produced
> by values() -- and turn it directly into a vector/list?  I tried just
> doing...

>    vector(split(" ", s))

  How about

  let (#rest strings) = split(" ", s);

  In CL, I'd just use values-list.
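
  To illustrate the difference, here is a minimal sketch. The names
three-values and demo are made up for this example, with three-values
standing in for split(). It assumes the format-out module is in scope, as in
the example library earlier in the thread.

```dylan
// Hypothetical stand-in for split(): returns three separate values.
define method three-values
    () => (a :: <string>, b :: <string>, c :: <string>)
  values("10", "23", "9999")
end method three-values;

define method demo () => ()
  // In argument position only the first returned value is used,
  // so this vector ends up with a single element.
  let one = vector(three-values());

  // #rest in a let binds a sequence holding every returned value.
  let (#rest strings) = three-values();

  format-out("vector(...) size: %d, #rest size: %d\n",
             one.size, strings.size);
end method demo;
```

  The #rest form is the only one of the two that sees all of the values,
which is why it can feed map(string-to-integer, ...) directly.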

  Mike McDonald



Mon, 17 Dec 2001 03:00:00 GMT  

Quote:

>   How about

>   let (#rest strings) = split(" ", s);

This already looks good. Now just add:

  let integer-values = map(string-to-integer, strings);

and you get a list of the integers. I think it would be even better if
split returned a list, then you could write

  let integer-values = map(string-to-integer, split(" ", s));

which would be quite reasonable.

Now what was the reason that multiple values aren't returned as lists?

Andreas

--
"We show that all proposed quantum bit commitment schemes are insecure because
the sender, Alice, can almost always cheat successfully by using an
Einstein-Podolsky-Rosen type of attack and delaying her measurement until she
opens her commitment." ( http://xxx.lanl.gov/abs/quant-ph/9603004 )



Mon, 17 Dec 2001 03:00:00 GMT  

Quote:

> > Here's one solution:

>   This solution creates more garbage than mine, which is already a major dog:

> define method parse-net(str :: <string>) => (net :: <net>)
>   let (num, count, left, bottom, right, top) = split-line(str);
>   make(<net>,
>        number: string-to-integer(num),
>        weight: string-to-integer(count),
>        left: string-to-integer(left),
>        bottom: string-to-integer(bottom),
>        right: string-to-integer(right),
>        top: string-to-integer(top));
> end;

I don't understand how mine is worse, when they are identical!

Quote:
>   How about

>   let (#rest strings) = split(" ", s);

Ah.  Didn't realise you could use #rest in a let list.

-- Bruce



Mon, 17 Dec 2001 03:00:00 GMT  

Quote:

> Now what was the reason that multiple values aren't returned as lists?

I assume it's so that the compiler is free to implement it without the
overhead of making and throwing away a list?

-- Bruce



Mon, 17 Dec 2001 03:00:00 GMT  

Quote:

> > Now what was the reason that multiple values aren't returned as lists?
> I assume it's so that the compiler is free to implement it without the
> overhead of making and throwing away a list?

Making things easy for the compiler is not exactly a dylanesque
argument. It could optimize away the list in cases where it is not
needed, i.e., generate the list on the caller side if actually needed.

Andreas




Mon, 17 Dec 2001 03:00:00 GMT  

Quote:

>  I have a string (actually, 82,000 of them) that has 6 integers in it. In C,
> I'd use sscanf() to pull the numbers out. Is there a good way to do this
> operation in dylan? I'm using d2c if that matters.

I think if you really have 82,000 strings, you would be better off writing
your own function that does this without using 'split', which in this case
exists simply to cons lots of intermediate garbage. 'string-to-integer' takes
start and end arguments; search for the delimiters and pass start and end
args. I'm sure this will end up being many times faster than using 'split'.

Or you could write 'sscanf' and post it to the community!



Mon, 17 Dec 2001 03:00:00 GMT  


Quote:

>>  I have a string (actually, 82,000 of them) that has 6 integers in it. In C,
>> I'd use sscanf() to pull the numbers out. Is there a good way to do this
>> operation in dylan? I'm using d2c if that matters.

> I think if you really have 82,000 strings, you would be better off writing
> your own function that does this without using 'split', which in this case
> exists simply to cons lots of intermediate garbage. 'string-to-integer' takes
> start and end arguments; search for the delimiters and pass start and end
> args. I'm sure this will end up being many times faster than using 'split'.

  The GwydionDylan version of string-to-integer() doesn't take start: and end:
arguments:

define method string-to-integer (string :: <sequence>, #key base = 10)
 => int :: <general-integer>;

  In this particular case, I probably want something like:

define method next-int(string :: <string>, #key start = 0, end: finish = #f)
  => (value :: false-or(<integer>), next-index :: <integer>)
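
  A minimal sketch of how such a function might be filled in. This is only an
illustration, not code from any Gwydion Dylan library: it handles base 10
only, treats the space character as the only separator, drops the end:
keyword for brevity, and does not guard against overflow. parse-ints is a
made-up driver showing how the returned index would be fed back in.

```dylan
// Hypothetical helper: skip spaces from start, then read an optional '-'
// and decimal digits. Returns the integer (or #f if no digits were
// found) and the index just past the last character consumed.
define method next-int
    (string :: <byte-string>, #key start :: <integer> = 0)
 => (value :: false-or(<integer>), next-index :: <integer>)
  let len = string.size;
  let i = start;
  while (i < len & string[i] == ' ')
    i := i + 1;
  end while;
  let negative? = (i < len & string[i] == '-');
  if (negative?)
    i := i + 1;
  end if;
  let first-digit = i;
  let result = 0;
  while (i < len & string[i] >= '0' & string[i] <= '9')
    result := result * 10 + (as(<integer>, string[i]) - as(<integer>, '0'));
    i := i + 1;
  end while;
  if (i == first-digit)
    values(#f, i)
  else
    values(if (negative?) 0 - result else result end, i)
  end if
end method next-int;

// Hypothetical driver: collect integers from one line until next-int
// reports that no further number could be read.
define method parse-ints
    (line :: <byte-string>) => (ints :: <stretchy-vector>)
  let ints = make(<stretchy-vector>);
  let pos = 0;
  block (done)
    while (#t)
      let (n, next) = next-int(line, start: pos);
      unless (n) done() end;
      add!(ints, n);
      pos := next;
    end while;
  end block;
  ints
end method parse-ints;
```

  Working on indices this way avoids allocating a substring per field, which
is the garbage that split() creates.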

Quote:
> Or you could write 'sscanf' and post it to the community!

  I was just hoping someone else had already done it. :-) But since no one
has, maybe I'll do unformat() instead. It always bugged me that sscanf()
ignored any text in the format string. I always wanted it to do a match.

  Mike McDonald



Mon, 17 Dec 2001 03:00:00 GMT  

Quote:


> > > Now what was the reason that multiple values aren't returned as lists?
> > I assume it's so that the compiler is free to implement it without the
> > overhead of making and throwing away a list?

> Making things easy for the compiler is not exactly a dylanesque
> argument.

   It isn't "easier" for the compiler. It is giving the compiler a choice of
   implementation, which is quite Dylanesque.

Quote:
> It could optimize away the list in cases where it is not
> needed, i.e., generate the list on the caller side if actually needed.

   I think you're extrapolating the specifics of "split" into the general case.
   Perhaps a better question is why does split return multiple values?
   There is a subtle difference between functions that return multiple values and
   functions that return an arbitrary amount of "stuff".  In the latter case,
   a collection would seem the better choice, rather than multiple values.

   In my experience, there is usually a pragmatic cap on the number of multiple values.
   So functions that may return a large amount of stuff should return a
   collection. Secondly, an almost universal use of #rest to collect all of the values is
   a sure sign that the "return value" really is a collection. Also, it is a false
   economy to think you are saving anything if you are only interested in the first or
   second element and the returned list is long. A list is very likely created anyway if
   the number of "multiple values" is beyond some very small number.

   IMHO, "let (#rest)" is useful when you don't know how many values are coming back but
   want them all.  For example, someone passes you a function as a first class value, you
   invoke it and don't know how many "significant" answers it returns.

   In the context of two or three return elements, those can be passed in registers.
   I don't see why

          i. I need to add to my code the overhead of extracting those two elements from
                 a collection I didn't need in the first place.
         ii. the compiler should go through gyrations to figure that out.

   We are BOTH doing too much work.  I don't think that is a positive outcome.

   Note that multiple values are also used in place of "call-by-reference".   Instead of
   sending  one (or more ) variables into a function to be modified, you pass them in and
   "rebind" them to new values afterwards.

      let  (  in-out1 , in-out2 )  = fcn ( in1 , in-out1 , in2 , in-out2, in3 )

    I don't see the need to explicitly involve a collection in that process at all.
    (and introducing reference to references has warts of its own).

----

Lyman



Tue, 18 Dec 2001 03:00:00 GMT  

Quote:
>    I think you're extrapolating the specifics of "split" into the general case.
>    Perhaps a better question is why does split return multiple values?

So that you can write

  let (username, password, uid, #rest the-rest) = split(":", passwd-line);

which resembles an often-seen idiom in perl.  One would have to write

  let (...) = apply(values,split(":", passwd-line));

to get the same effect if split returned a list.  What we really need,
and I'm extrapolating into a general case here, is a function that
turns multiple values into a list, the opposite of APPLY in the sense
that multiple values are the opposite of multiple arguments.  Mark
Nahabedian told me in private communication that indeed Lisp has such
a facility, called MULTIPLE-VALUE-LIST.  One could implement that as
a macro, as in:

define macro multiple-value-list
  {
    multiple-value-list(?:expression)
  }
 =>
  {
    let (#rest returned-values) = ?expression;
    returned-values
  }
end macro multiple-value-list;

I'm searching for a better name, as this doesn't really return a list,
but a vector.  But using that macro one could solve the original
problem (turning a string of numbers into a collection of integers) using:

map(string-to-integer, multiple-value-list(split(" ", my-string)));

What do you think?  Would it make sense to include this macro into the
standard?

Andreas




Tue, 18 Dec 2001 03:00:00 GMT  


Quote:

>    I think you're extrapolating the specifics of "split" into the general case.
>    Perhaps a better question is why does split return multiple values?

  What's really annoying about the GwydionDylan version of split() is that it
builds a sequence (actually a <deque>) up internally and then does
apply(values, strings); on it. Argh!

  I consider apply(values, ...) to be one of those red flags that maybe I
should stop and think about what I'm doing. If all of the multiple return
values have the same meaning, then they should be returned in a list/sequence.
This is especially true if there is not a fixed number of them. Just my two
cents.
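
  As a rough sketch of that alternative, here is a splitter that hands back
the sequence directly instead of going through apply(values, ...). The name
split-fields is made up for this example and is not part of any library; it
splits on a single character and yields empty strings for adjacent
delimiters.

```dylan
// Hypothetical alternative to split(): returns the fields as one
// sequence rather than as multiple values.
define method split-fields
    (string :: <byte-string>, delimiter :: <character>)
 => (fields :: <stretchy-vector>)
  let fields = make(<stretchy-vector>);
  let start = 0;
  for (i from 0 below string.size)
    if (string[i] == delimiter)
      // Copy the characters from start up to (not including) i.
      add!(fields, copy-sequence(string, start: start, end: i));
      start := i + 1;
    end if;
  end for;
  // The last field runs from the final delimiter to the end.
  add!(fields, copy-sequence(string, start: start));
  fields
end method split-fields;
```

  With a return value like this, map(string-to-integer, split-fields(s, ' '))
works without #rest or a wrapper macro, though it still allocates one
substring per field.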

  Mike McDonald



Tue, 18 Dec 2001 03:00:00 GMT  


Quote:


>> define macro multiple-value-list
>>   {
>>     multiple-value-list(?:expression)
>>   }
>>  =>
>>   {
>>     let (#rest returned-values) = ?expression;
>>     returned-values
>>   }
>> end macro multiple-value-list;

>> What do you think?  Would it make sense to include this macro into the
>> standard?

> Yes, such functionality would be handy.  I'm not sure what would be a good
> name for the macro though.

> as-vector(split(" ", my-string))
> multiple-values-to-vector(split(" ", my-string))
> values-to-vector(split(" ", my-string))
> multiple-value-vector(split(" ", my-string))

  But it's not a vector, it's a sequence. So the name "list" is just as valid
as the name "vector".  

  DRM/drm-113.html#MARKER-2-29:

        The last variable may be preceded by #rest, in which case it is bound
     to a sequence containing all the remaining values.

  Mike McDonald



Tue, 18 Dec 2001 03:00:00 GMT  

Quote:

> that multiple values are the opposite of multiple arguments.  Mark
> Nahabedian told me in private communication that indeed Lisp has such
> a facility, called MULTIPLE-VALUE-LIST.  One could implement that as

  Common Lisp has a seemingly more apropos macro, DESTRUCTURING-BIND, which
  allows you to bind some values of a list.

   (destructuring-bind ( username password &rest the-rest )  (split ":" passwd-line)
                   .... )

  Unfortunately, I don't have time at the moment to whip up its implementation.  My initial
  thoughts on syntax would be:

    destructuring-bind (username, password, #rest the-rest) with split(":", passwd-line)
       ....
    end destructuring-bind;

  or perhaps

    destructuring-bind (username, password, #rest the-rest) = split(":", passwd-line);

  The expansion in the latter I'm looking for is along the lines of

      let gensym29 = split(":", passwd-line);
      let username = first(gensym29);
      let password = second(gensym29);
      let the-rest = tail(tail(gensym29));

  This slavishly tries to imitate what the CL macro is likely to do.
  I think there is a way to handle up to some fixed N (three or four) arguments relatively
  easily.  And gensym would be extremely handy. :-)

  In the former case (and not hung up on the gensym problem).

     method ( username , password , #rest the-rest ) ... end ( split( ":" , passwd-line )) ;

  In some sense, these are roughly equivalent ways of doing the same thing.

---

Lyman



Tue, 18 Dec 2001 03:00:00 GMT  

Quote:

>   In the former case (and not hung up on the gensym problem).

>      method ( username , password , #rest the-rest ) ... end ( split( ":" , passwd-line )) ;

   Drat! Just remembered this has the same potential problem as multiple values. Usually there
   is a practical upper limit on how many arguments you can send to a function.
   [ which is why this isn't what CL typically macroexpands into either ]

   Likewise for apply(some-fcn, make-list(10000)): I'd be surprised if this actually
   worked portably.  In fact the DRM has a starred footnote on that.

       http://www.harlequin.com/products/ads/dylan/doc/drm/drm_49.htm#FOOTNO...

   It would be consistent for let's #rest to have a similar cap.
   I suppose in the username/password case we're not particularly interested in the fact
   that the-rest may be truncated. However, it isn't a good practice to fall into.
   Someone might think it was actually correct.

---

Lyman



Tue, 18 Dec 2001 03:00:00 GMT  
 