Altering one word in a long string? 

I am writing an application which reads records from two files, compares
each corresponding word in the two resulting strings, and (according to
a rule) selects one of the two to put into an output file.  Each string
consists of 150 words and all of the 150 must be compared and output.

My first thought was to use them as strings, but I suddenly discovered
that there doesn't seem to be a way to replace a word in a string with
another word.  I then thought of parsing the strings into a compound
variable, but found that when I put a stem after the "with", the entire
string wound up in every element.

Is there a simple way to do what I want or must I manually put each
element into  a compound variable, do what I want to it, then output it?

Thanks,

--Tom

-------------------------------------------------------------------------------
Tom Rubinstein                   "Opinions expressed are not necessarily
Motorola                          those of Motorola"
P.O. Box 85036


Tel:  619 530-8432                Fax:  619 530-8470



Mon, 25 Mar 1996 01:39:17 GMT  

Quote:

> I am writing an application which reads records from two files, compares
> each corresponding word in the two resulting strings, and (according to
> a rule) selects one of the two to put into an output file.  Each string
> consists of 150 words and all of the 150 must be compared and output.

> My first thought was to use them as strings, but I suddenly discovered
> that there doesn't seem to be a way to replace a word in a string with
> another word.  I then thought of parsing the strings into a compound
> variable, but found that when I put a stem after the "with", the entire
> string wound up in every element.

> Is there a simple way to do what I want or must I manually put each
> element into  a compound variable, do what I want to it, then output it?

You don't specify what OS environment you're on, so there may be
alternatives.  Look at the REXX functions WORD(), WORDS(), WORDPOS(),
and SUBWORD().  Assuming the words are space delimited, you can use
these functions to loop through the two strings.  The SUBWORD() function
can be used to break apart and rebuild the string, replacing the desired
word, and all in one assignment statement.  For example, say the
variable cntr equals the word position you wish to replace in one
string.

  str1=subword(str1,1,cntr-1) 'NEWWORD' subword(str1,cntr+1)

Of course, this can be made more robust and bulletproof, but you get
the concept.  Comparison of words between the strings can make use of
these functions too.  If you still need to assign to a stem, they would
be used there too, or you can use a PARSE instruction.  A Do cntr=1 to
words(str) loop would be used too.
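
As one way of making the one-liner above a little more robust, a sketch
along these lines might do.  replace_word is a hypothetical helper name
(not a built-in), and it assumes blank-delimited words.  SPACE() removes
the stray blank that would otherwise appear when the first or last word
is replaced; note that it also reduces runs of blanks to single blanks.

```rexx
/* Sketch only: replace_word is a hypothetical helper, not a built-in */
str1 = 'alpha beta gamma'
say replace_word(str1, 2, 'NEWWORD')    /* shows: alpha NEWWORD gamma */
exit 0

replace_word: procedure
   parse arg str, n, new
   /* leave the string alone if n is not a valid word position */
   if \datatype(n, 'W') then return str
   if n < 1 | n > words(str) then return str
   /* n-1 may be 0 here; SUBWORD accepts a length of zero */
   return space(subword(str, 1, n-1) new subword(str, n+1))
```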

Contact me directly if you need more help.

Regards,
David McAnally
CSS Applications Engineering
Motorola



Mon, 25 Mar 1996 05:32:20 GMT  

Quote:

>I am writing an application which reads records from two files, compares
>each corresponding word in the two resulting strings, and (according to
>a rule) selects one of the two to put into an output file.  Each string
>consists of 150 words and all of the 150 must be compared and output.

>My first thought was to use them as strings, but I suddenly discovered
>that there doesn't seem to be a way to replace a word in a string with
>another word.  I then thought of parsing the strings into a compound
>variable, but found that when I put a stem after the "with", the entire
>string wound up in every element.

>Is there a simple way to do what I want or must I manually put each
>element into  a compound variable, do what I want to it, then output it?

I have written a WORDTRANSLATE (internal) function, which might be what
you need. It is the equivalent of TRANSLATE for words (well, at least it
should be).

I'll append it below my signature, with a main program which makes it
available as an external function. (The main program is for OS/2, but
may also work on other operating systems. For CMS, however, I assume it
has to be changed because of the data delivered by PARSE SOURCE and the
unavailability of LINEOUT in older CMS versions; for the latter, simply
replace "call lineout 'stderr'," with "say".) I'll also append a help
file for it. (The help file is in CMS format.)

Hope this helps!

Horst
 - - - - - - - - - - - - - - - - - - - - - - -

--- WORDTRANSLATE.CMD --------------------------------------------------
/*                                                                    */
/* Purpose: External subprogram to call the internal one below        */
/*                                                                    */

trace off

parse source . subprogram my_name

if   translate(subprogram) \== 'FUNCTION',
   & translate(subprogram) \== 'SUBROUTINE' then do

   call lineout 'stderr', 'Error:' my_name
   call lineout 'stderr', ' is a REXX external subprogram',
                           'and must not be called in another way.'
   exit 1

end /* then */
else do

   return wordtranslate(arg(1), arg(2), arg(3), arg(4))

end /* else */

/**********************************************************************/

wordtranslate:
   procedure

   /* Argument format: string, <tableo>, tablei<, pad>               */
   /* Function: Translate the words in "string" to others,           */
   /*    much like TRANSLATE.                                        */
   /*    "tablei" and "tableo" are the input and output word tables. */
   /*    The pad word "pad" is added to "tablei" once                */
   /*    (to be found by WORDPOS) and to "tableo" as many times      */
   /*    as necessary to make all words in "tablei"                  */
   /*    have a counterpart in "tableo".                             */
   /*    All defaults are null strings.                              */
   /*    The result has no leading or trailing blanks and only       */
   /*    single blanks between words.                                */

   trace off

   parse arg string, tableo, tablei, pad .

   tablei= tablei pad
   tableo= tableo copies(pad' ', max(0, words(tablei) -words(tableo)))

   myresult= ''
   do i= 1 to words(string)
      sw= word(string, i)
      swf= wordpos(sw, tablei)
      if swf > 0 then do
          myresult= myresult word(tableo, swf)
      end /* then */
      else do
          myresult= myresult sw
      end /* else */
   end /* to */
   myresult= space(myresult)
return myresult
--- WORDTRANSLATE.HLP --------------------------------------------------
.cm Help file: WORDTRANSLATE under REXX
.cm            for the WORDTRANSLATE external REXX function or subroutine
.cm            version 1.0
.cm
.cm Disclaimer: This software is provided as-is, with no warranties
.cm
.cm Please note: Please, report all errors, changes, questions or comments
.cm              to the author
.cm

.cm
.cm Last change: Thursday, 7. October 1993
.fo off

WORDTRANSLATE

(an external function or subroutine)

Its format is:

+----------------------------------------------------------------------+
|                                                                      |
|  WORDTRANSLATE(string, <tableo>, tablei<, pad>)                      |
|                                                                      |
+----------------------------------------------------------------------+

It translates the words in "string" to others (or reorders the words in
a string -- see below), much like TRANSLATE. "tablei" and "tableo" are
the input and output word tables. The pad word "pad" is added to
"tablei" once (to be found by WORDPOS) and to "tableo" as many times as
necessary to make all words in "tablei" have a counterpart in "tableo".

All defaults are null strings.

The result has no leading or trailing blanks and only single blanks
between words.

Here are some examples:

   wordtranslate('a b b c', '&', 'b')                  = 'a & & c'
   wordtranslate('a b c d e f', '1 2', 'e c')          = 'a b 2 d 1 f'
   wordtranslate('a b c d e f', '1 2', 'a b c d', '.') = '1 2 . . e f'
   wordtranslate('4 1 2 3', 'a b c d', '1 2 3 4')      = 'd a b c'

Note:

The last  example shows  how WORDTRANSLATE  may be  used to  reorder the
words in a string. In the  example, any 4-word string could be specified
as the second argument and its last word would be moved to the beginning
of the string.



Mon, 25 Mar 1996 18:54:28 GMT  
At the risk of being the tenth person to answer this...

Quote:

>I am writing an application which reads records from two files, compares
>each corresponding word in the two resulting strings, and (according to
>a rule) selects one of the two to put into an output file.  Each string
>consists of 150 words and all of the 150 must be compared and output.

It sounds to me like something similar to this would do the job:

input1 = linein(file1)           /* or whatever */
input2 = linein(file2)           /* or whatever */
output = ''

do while input1 \= ''
   parse var input1 word1 input1 /* remove one word from input1 */
   parse var input2 word2 input2 /* remove one word from input2 */
   word = rule(word1,word2)      /* Apply whatever rule is necessary */
   output = output word          /* Add the word to the output string */
end

say strip(output,'L')            /* remove an initial space */

Quote:
>My first thought was to use them as strings, but I suddenly discovered
>that there doesn't seem to be a way to replace a word in a string with
>another word.

Some useful function calls for dealing with words are...

words(x)         /* the number of words in x */
subword(x,1,n)   /* the first n words of x */
subword(x,n)     /* the nth and subsequent words of x */
subword(x,m,n)   /* the mth, m+1th, ..., (m+n-1)th words of x */

You can replace the nth word of x with "foo" by saying

x=subword(x,1,n-1) "foo" subword(x,n+1)

Quote:
>               I then thought of parsing the strings into a compound
>variable, but found that when I put a stem after the "with", the entire
>string wound up in every element.

That is correct.  If you say:

parse var x stem.

then all that happens is:

stem.=x

which is not what you want.  You could say:

parse var x stem.1 stem.2 stem.3 stem.4 ... stem.150

but that would be a lot of typing.  There is no way, other than writing
a loop, to split up a string into individual words.

Ian Collier



Mon, 25 Mar 1996 22:29:38 GMT  
Quote:


>> I am writing an application which reads records from two files, compares
>> each corresponding word in the two resulting strings, and (according to
>> a rule) selects one of the two to put into an output file.  Each string
>> consists of 150 words and all of the 150 must be compared and output.

>> My first thought was to use them as strings, but I suddenly discovered
>> that there doesn't seem to be a way to replace a word in a string with
>> another word.  I then thought of parsing the strings into a compound
>> variable, but found that when I put a stem after the "with", the entire
>> string wound up in every element.

>> Is there a simple way to do what I want or must I manually put each
>> element into  a compound variable, do what I want to it, then output it?

  If you have PIPES,  assuming your string is in the variable str,
    'PIPE VAR STR | SPLIT | STEM W.'
 will return the words in stem w.


Tue, 26 Mar 1996 02:25:19 GMT  

Quote:

>I am writing an application which reads records from two files, compares
>each corresponding word in the two resulting strings, and (according to
>a rule) selects one of the two to put into an output file.  Each string
>consists of 150 words and all of the 150 must be compared and output.

>My first thought was to use them as strings, but I suddenly discovered
>that there doesn't seem to be a way to replace a word in a string with
>another word.  I then thought of parsing the strings into a compound
>variable, but found that when I put a stem after the "with", the entire
>string wound up in every element.

>Is there a simple way to do what I want or must I manually put each
>element into  a compound variable, do what I want to it, then output it?

There is a way to replace a word in a string. If W_BEGIN is the index
of the beginning of the word, and W_END is the index of the end, and LINE
contains the string, and NEW_WORD is the word to be substituted,

LINE = substr(line,1,W_BEGIN-1) || NEW_WORD || substr(line,W_END+1)

will do it.
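
If only the word number is known, the two indexes can be derived with
WORDINDEX() and WORDLENGTH().  A minimal sketch, assuming blank-delimited
words; N and the sample values are made up for illustration:

```rexx
/* Sketch: derive W_BEGIN and W_END for the Nth word, then splice */
LINE     = 'one two   three four'
NEW_WORD = 'TWO'
N        = 2                        /* illustrative word number */

W_BEGIN = wordindex(LINE, N)        /* 0 if LINE has fewer than N words */
if W_BEGIN > 0 then do
   W_END = W_BEGIN + wordlength(LINE, N) - 1
   LINE  = substr(LINE, 1, W_BEGIN-1) || NEW_WORD || substr(LINE, W_END+1)
end
say LINE                            /* shows: one TWO   three four */
```

Unlike a SUBWORD() rebuild, this leaves the original spacing between the
words untouched.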

To parse the string into a compound variable,

STEM. = ""
I = 1
DO WHILE LINE ~= ""
    PARSE VAR LINE STEM.I LINE
    I = I + 1
END

will put the words into STEM.1 ... STEM.150

--


                            Fido 1:163/109.4
    None of what I say is what my employer says, ever!
    Phrenography practiced here - competitive rates!



Tue, 26 Mar 1996 05:06:24 GMT  

Quote:

>I am writing an application which reads records from two files, compares
>each corresponding word in the two resulting strings, and (according to
>a rule) selects one of the two to put into an output file.  Each string
>consists of 150 words and all of the 150 must be compared and output.

Take a look at the word() function. You can use a Do i = 1 to 150 and
look at each word in the string. This will also let you build a new
string with either word(String1,i) or word(String2,i). There might be
more efficient ways to perform this. What are you trying to do exactly?
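
A minimal sketch of that loop, assuming both strings hold the same
number of words. The rule shown (keep the longer word) is only a
placeholder, since the actual rule isn't given, and the sample strings
are made up:

```rexx
/* Sketch: walk both strings word by word and build the output */
String1 = 'apple fig cherry'
String2 = 'kiwi banana plum'
Output  = ''

do i = 1 to words(String1)
   w1 = word(String1, i)
   w2 = word(String2, i)
   /* placeholder rule: keep the longer of the two words */
   if length(w1) >= length(w2) then Output = Output w1
   else Output = Output w2
end

Output = strip(Output)      /* drop the blank added before the first word */
say Output                  /* shows: apple banana cherry */
```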

73's  de  Jack - kf5mg





Mon, 25 Mar 1996 21:35:52 GMT  
I want to thank everybody for the overwhelming response to my question.
The program is now working.

--Tom

-------------------------------------------------------------------------------
Tom Rubinstein                   "Opinions expressed are not necessarily
Motorola                          those of Motorola"
P.O. Box 85036



Tel:  619 530-8432                Fax:  619 530-8470



Wed, 27 Mar 1996 06:38:18 GMT  
I know REXX well enough to figure out how to alter one word in a
string, but I have a related question.  Consider the following
statements:

  x = copies('X', 100000)

  y = overlay('Y', x, 50000)

  x = overlay('Z', x, 50000)

To execute the second statement, you must create a copy of the (very
large) string x, overwrite one character in that string, and assign the
resulting string to y.

The third statement is syntactically equivalent to the second, and I am
assuming it would be executed the same way.  But it is extremely
inefficient to make a temporary copy of x, change that copy, and then
reassign the altered copy to x.  What you really want to do here is
write directly onto the string x itself, avoiding the overhead of
making a copy, the way you would in C:

  x[50000] = 'Z';

One possibility, which I think has been discussed here, would be to use
the substr() function as a sort of pseudo-variable:

  substr(x, 50000, 1) = 'Z'

This apparently causes all sorts of syntactical difficulties which I
won't pretend to remember clearly.  But it seems to me that in fact
this sort of thing is not necessary anyway -- that there is no reason
that the REXX interpreter should not be smart enough to realize, at
least some of the time, when a direct write onto a string is
appropriate, and handle the situation accordingly, bypassing the
temporary copy.  So I finally get to my question, which is, is this in
fact the case?  Do some, or all, REXX interpreters in fact handle this
situation and related situations efficiently?  Or is there some reason
why they can't?

It has been a while now since I have done much with REXX (on VM), but I
remember several times having to write programs that made small changes
to large strings, and being worried about efficiency.  If in fact this
is being handled well I think that fact should be better publicized.
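
One rough way to probe this on a given interpreter is simply to time the
two forms.  The sketch below is only a measurement aid: the repeat count
is arbitrary, and similar timings would merely suggest (not prove) that
both statements go through the same copy-and-assign path.

```rexx
/* Sketch: rough timing of OVERLAY on a large string */
x    = copies('X', 100000)
reps = 1000                          /* arbitrary repeat count */

call time 'R'                        /* reset the elapsed-time clock */
do reps
   y = overlay('Y', x, 50000)        /* result goes to another variable */
end
t_copy = time('E')

call time 'R'
do reps
   x = overlay('Z', x, 50000)        /* result goes back into x */
end
t_same = time('E')

say 'to y:' t_copy 'seconds, back to x:' t_same 'seconds'
```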
--
John Brock
uunet!csfb1!jbrock



Wed, 27 Mar 1996 10:15:03 GMT  
John, the questions you raise about changing the middle of a string
are implementation questions.   The language is silent (I believe)
about whether or not x=overlay(y,x,n) causes creation of a temporary
in all cases or not.  Now what you are saying is that a smart implementer
of overlay might take advantage of possible efficiencies.   And that is
true.  But it is true now.  What kind of change are you suggesting?
Mandated internals?   I don't get it.   Dave

Dave Gomberg, role model for those who don't ask much in their fantasy lives.



Thu, 28 Mar 1996 00:05:31 GMT  

Quote:

>John, the questions you raise about changing the middle of a string
>are implementation questions.   The language is silent (I believe)
>about whether or not x=overlay(y,x,n) causes creation of a temporary
>in all cases or not.  Now what you are saying is that a smart implementer
>of overlay might take advantage of possible efficiencies.   And that is
>true.  But it is true now.  What kind of change are you suggesting?
>Mandated internals?   I don't get it.   Dave

I wasn't suggesting a change, I was asking about implementation.  The
difference between the two possible implementations is rather large,
and I wanted to know how in fact things were done.

Quote:
>Dave Gomberg, role model for those who don't ask much in their fantasy lives.


--
John Brock
uunet!csfb1!jbrock



Thu, 28 Mar 1996 17:06:18 GMT  
John, since we agree you are asking an implementation question, you fail
to specify for which implementation you are asking.  Please advise.

Dave Gomberg, role model for those who don't ask much in their fantasy lives.



Thu, 28 Mar 1996 14:02:20 GMT  

Quote:

>John, since we agree you are asking an implementation question, you fail
>to specify for which implementation you are asking.  Please advise.

Any and all, I guess.  I'm most interested in VM/CMS, because that is
what I use on the rare occasions when I still write REXX code, but I
was also just interested in knowing how most REXX interpreters treated
this particular situation.  Can I count on finding this particular
"optimization" in most REXX interpreters?

Also, since the difference in efficiency between the two possible
implementations I mentioned is so large, it seems that in general it
might be worthwhile to let the user know somewhere in the
documentation how this is handled.  It could be very reassuring in
certain cases to know that overlaying a single byte on a large string
is a cheap operation, not the expensive one it might appear to be.

The thing is, if the situation is really as easy to recognize as it
seems to me to be, then I would expect that it would be handled
efficiently.  But for all I know there may be logical difficulties I've
overlooked, which force implementors to forgo my "optimization".  Or
maybe they all just got lazy (it happens!).  It would be nice to know
for sure, one way or the other.
--
John Brock
uunet!csfb1!jbrock



Fri, 29 Mar 1996 14:55:48 GMT  
 