OK, turing.

Wed Mar 22 21:04:21 EDT 2023

From: Rob Pike <robpike@gmail.com>
Date: Mon, 20 Mar 2023 07:27:34 +1100
To: Ralph Corderoy <ralph@inputplus.co.uk>
Message-ID-Hash: 4CIQUL4LMERBXU76GRUDZRM6KNK7CKIJ
CC: tuhs@tuhs.org
Subject: [TUHS] Re: Bell Foreign-Language UNIX Efforts
Archived-At:
<https://www.tuhs.org/mailman3/hyperkitty/list/tuhs@tuhs.org/message/4CIQUL4LMERBXU76GRUDZRM6KNK7CKIJ/>

As my mail quoted in
https://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt says,
Ken worked out a new packing that avoided all the problems with the
existing ones.  He didn't alter Prosser's encoding.  UTF-8, as it was later
called, was not based on anything but it was deeply informed by a couple of
years of work coming to grips with the problem of programming with
multibyte characters.  What Prosser did do, and what we - all of us - are
very grateful for, is start the conversation about replacing UTF with
something practical.

(Speaking of design by committee, the multibyte stuff in C89 was atrocious,
and I heard was done in committee to get someone, perhaps the Japanese, to
sign off.)

Regarding Windows, Nathan Myhrvold visited Bell Labs around this time, and
we tried to talk to him about this, but he wasn't interested, claiming they
had it all worked out.  We later learned what he meant, and lamented.  Not
the only time someone wasn't open to hear an idea that might be worth
hearing, but an educational one.

It's important historically to understand how all the forces came together
that day.  The world was ready for a solution to international text, the
proposed character set was acceptable to most but the ASCII compatibility
issues were unbearable, the proposed solution to that was noxious, various
committees were starting to solve the problem in committee, leading to
technical briefs of varying quality, none right, and somehow a phone call
was made one afternoon to a couple of people who had been thinking and
working these issues for ages, one of whom was a genius.  And it all worked
out, which is truly unusual.

-rob




Wed Mar 22 21:04:05 EDT 2023
From: Rob Pike <robpike@gmail.com>
Date: Mon, 20 Mar 2023 20:22:52 +1100
To: arnold@skeeve.com
Message-ID-Hash: EZLOHBOMUAMC342PE5OEL7QFZ6VBDRV6
CC: tuhs@tuhs.org
Subject: [TUHS] Re: Bell Foreign-Language UNIX Efforts
Archived-At:
<https://www.tuhs.org/mailman3/hyperkitty/list/tuhs@tuhs.org/message/EZLOHBOMUAMC342PE5OEL7QFZ6VBDRV6/>

Exactly the way we did it in Plan 9, and published in the paper cited
earlier.  In fact, it's possible the library work was done as early as 1989,
but I'm not sure.  Certainly by 1990.

-rob


On Mon, Mar 20, 2023 at 6:55 PM <arnold@skeeve.com> wrote:

> Hi Rob.
>
> Rob Pike <robpike@gmail.com> wrote:
>
> > (Speaking of design by committee, the multibyte stuff in C89 was
> > atrocious, and I heard was done in committee to get someone, perhaps the
> > Japanese, to sign off.)
>
> It's not lovely, but I wouldn't call it atrocious.  It gets the job
> done; code using it can handle multibyte encodings while being totally
> character-set agnostic.  I speak from experience, gawk does this.
> (I use the "restartable" routines - mbrlen() and so on.)
>
> I understand that Unicode + UTF-8 solve the issue completely.  But I'd
> like to ask, in all seriousness and so that I can learn, given the world
> as it was in 1989, how would you solve the problem?  If you had designed
> the C level routines, what would they have looked like?
>
> Thanks,
>
> Arnold
>
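
For readers who have not used the restartable routines Arnold mentions, here
is a minimal sketch (not gawk's code) of counting characters with mbrlen()
from <wchar.h> without knowing the encoding.  The name count_chars is
hypothetical, and counting an invalid or truncated byte as one character is
an assumption made for illustration; a real tool would pick its own policy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Count the characters in s according to the current locale's
 * multibyte encoding.  Hypothetical helper for illustration only.
 */
size_t
count_chars(const char *s)
{
	mbstate_t st;
	size_t n, left;

	assert(s != NULL);
	n = 0;
	left = strlen(s);
	memset(&st, 0, sizeof st);	/* start in the initial shift state */
	while(left > 0){
		size_t k = mbrlen(s, left, &st);

		if(k == (size_t)-1 || k == (size_t)-2){
			/* Invalid or truncated sequence (assumed policy):
			 * count one byte and reset the conversion state. */
			k = 1;
			memset(&st, 0, sizeof st);
		}else if(k == 0){
			break;	/* null character; strlen stops before it */
		}
		s += k;
		left -= k;
		n++;
	}
	return n;
}
```

mbrlen() decodes according to the LC_CTYPE locale, so a caller would
normally run setlocale(LC_CTYPE, "") first; in the default "C" locale the
call count_chars("hello") returns 5.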




Wed Mar 22 21:03:28 EDT 2023
From: Rob Pike <robpike@gmail.com>
Date: Wed, 22 Mar 2023 23:02:33 +1100
To: Skip Tavakkolian <fariborz.t@gmail.com>
Message-ID-Hash: TQWQ4V4ZNSW4PDXPM3HHXZGAYYNNPTV2
CC: tuhs@tuhs.org
Subject: [TUHS] Re: Bell Foreign-Language UNIX Efforts
Archived-At:
<https://www.tuhs.org/mailman3/hyperkitty/list/tuhs@tuhs.org/message/TQWQ4V4ZNSW4PDXPM3HHXZGAYYNNPTV2/>

The appendix version named it plain UTF, repurposing the extant name to the
new encoding.  The -8 came later, as it is in these linked documents,
because some people wanted a UTF-7 and a UTF-16.  Those people should be
punished.

-rob



On Wed, Mar 22, 2023 at 9:09 PM Skip Tavakkolian <fariborz.t@gmail.com>
wrote:

> Also here:
> https://github.com/0intro/plan9/tree/main/sys/doc
>
>
> On Wed, Mar 22, 2023, 3:02 AM Skip Tavakkolian <fariborz.t@gmail.com>
> wrote:
>
>> http://p9f.org/sys/doc/utf.ps
>>
>> On Wed, Mar 22, 2023, 12:41 AM <arnold@skeeve.com> wrote:
>>
>>> Thanks.  Is there a link to postscript or pdf of the paper?  I
>>> undoubtedly read it decades ago, but I doubt that I have it handy.
>>>
>>> Thanks,
>>>
>>> Arnold
>>>
>>> Rob Pike <robpike@gmail.com> wrote:
>>>
>>> > Pretty much, as it was the Plan 9 UTF man page at the time.  This link
>>> > will be essentially the same.
>>> >
>>> > http://man.cat-v.org/plan_9/6/utf
>>> >
>>> > -rob
>>> >
>>> >
>>> > On Wed, Mar 22, 2023 at 6:12 PM Mehdi Sadeghi <mehdi@mehdix.org> wrote:
>>> >
>>> > > It's a long shot but is that appendix around by any chance?
>>> > >
>>> > >
>>> > > Mehdi
>>> > >
>>> > >
>>> > > On 3/22/23 03:52, Rob Pike wrote:
>>> > >
>>> > > the paper had an appendix that described UTF-8's encoding rigorously,
>>> > > but that was dropped
>>> > >
>>> > >
>>>
>>




Wed Mar 22 21:03:23 EDT 2023
From tuhs.org!tuhs-bounces Tue Mar 21 22:52:38 -0400 2023
From: Rob Pike <robpike@gmail.com>
Date: Wed, 22 Mar 2023 13:52:16 +1100
To: Larry McVoy <lm@mcvoy.com>
Message-ID-Hash: NI6WPNPUYHGUZ7BHZBZBAIW424VTVTGZ
CC: tuhs@tuhs.org
Subject: [TUHS] Re: Bell Foreign-Language UNIX Efforts
Archived-At:
<https://www.tuhs.org/mailman3/hyperkitty/list/tuhs@tuhs.org/message/NI6WPNPUYHGUZ7BHZBZBAIW424VTVTGZ/>

Thanks for your support but C89 didn't specify an encoding.  In classic
committee fashion, it refused to take a stand about anything that might
limit adoption.  The problem was that the API it offered was clumsy and made
encoding errors hard to ignore.  (Grepping a file for a string, do you
really care if there is an irrelevant binary blob in the middle that isn't
kosher UTF-8?) Also, it provided no support for printing "wide" characters.
This is all covered in the paper cited above.*

The original UTF was compatible with ASCII but not robust if there was an
alignment problem, and also used printable ASCII characters in multibyte
sequences.  You could find a '/' inside a Cyrillic character encoding, which
broke Unix badly.  That's why FSS-UTF, File-safe UTF, was the name given to
Prosser's variant.

It's wrong to give us credit for properties we didn't introduce.  But UTF-8
is more regular, simpler to encode and decode, and more robust than its
predecessors.  Most important, it did introduce the self-synchronization
property, which was the key that opened the door for us at X/Open.

-rob

* In a classic Usenix whoops, the paper had an appendix that described
UTF-8's encoding rigorously, but that was dropped when it was published in
the conference proceedings.  Perhaps that's why the RFC got in the mix and
started some of the confusion about its origin.
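
The self-synchronization property mentioned above can be sketched in a few
lines of C.  In UTF-8 every continuation byte has the bit pattern 10xxxxxx,
so a reader dropped at an arbitrary byte offset can find the next character
boundary by skipping such bytes.  The name utf8_resync is hypothetical; this
is an illustration, not code from Plan 9.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the offset of the first character boundary at or after pos.
 * Continuation bytes (10xxxxxx) are skipped; any other byte starts
 * a character.  Hypothetical helper for illustration only.
 */
size_t
utf8_resync(const unsigned char *s, size_t len, size_t pos)
{
	assert(pos <= len);
	while(pos < len && (s[pos] & 0xC0) == 0x80)
		pos++;
	return pos;
}
```

For the four bytes 'a', 0xD0, 0x96, 'z' (the letter a, Cyrillic U+0416, the
letter z), starting at offset 2, in the middle of the Cyrillic character,
yields offset 3, the start of 'z'.  At most a few bytes are skipped, so a
misaligned read loses one character rather than the rest of the stream.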


On Wed, Mar 22, 2023 at 1:25 PM Larry McVoy <lm@mcvoy.com> wrote:

> The brilliance of UTF-8 was to encode ASCII as is.  That seems obvious in
> retrospect but as Rob says, the multibyte crud in C89 was just awful,
> and that was the answer at the time.  Fitting ASCII in as is meant
> that all of the Unix utilities, sed, grep, awk, etc, had close to no
> performance hit if you were processing ASCII.  That's pretty cool when
> you get that and you can process Japanese et al as well.
>
> I kind of cringe when I say it is brilliant to not break what exists
> already, to me, that's just part of what you do as an engineer.  But
> history has shown that not breaking stuff, fitting the new into the
> old, is brilliant.  So kudos to Rob and Ken for doing that (but truth
> be told, I'd be stunned if they didn't, they are great engineers).
>
> On Mon, Mar 20, 2023 at 07:27:34AM +1100, Rob Pike wrote:
> > As my mail quoted in
> > https://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt says,
> > Ken worked out a new packing that avoided all the problems with the
> > existing ones.  He didn't alter Prosser's encoding.  UTF-8, as it was later
> > called, was not based on anything but it was deeply informed by a couple of
> > years of work coming to grips with the problem of programming with
> > multibyte characters.  What Prosser did do, and what we - all of us - are
> > very grateful for, is start the conversation about replacing UTF with
> > something practical.
> >
> > (Speaking of design by committee, the multibyte stuff in C89 was atrocious,
> > and I heard was done in committee to get someone, perhaps the Japanese, to
> > sign off.)
> >
> > Regarding Windows, Nathan Myhrvold visited Bell Labs around this time, and
> > we tried to talk to him about this, but he wasn't interested, claiming they
> > had it all worked out.  We later learned what he meant, and lamented.  Not
> > the only time someone wasn't open to hear an idea that might be worth
> > hearing, but an educational one.
> >
> > It's important historically to understand how all the forces came together
> > that day.  The world was ready for a solution to international text, the
> > proposed character set was acceptable to most but the ASCII compatibility
> > issues were unbearable, the proposed solution to that was noxious, various
> > committees were starting to solve the problem in committee, leading to
> > technical briefs of varying quality, none right, and somehow a phone call
> > was made one afternoon to a couple of people who had been thinking and
> > working these issues for ages, one of whom was a genius.  And it all worked
> > out, which is truly unusual.
> >
> > -rob
>
> --
> ---
> Larry McVoy Retired to fishing
> http://www.mcvoy.com/lm/boat
>
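
The two points made in this exchange, that ASCII is encoded as is and that a
byte such as '/' can never appear inside a multibyte sequence, can be seen
in a small encoder.  This is a sketch under assumptions: it follows the
modern four-byte limit (code points up to U+10FFFF, surrogates rejected)
rather than the longer sequences the original 1992 scheme allowed, and the
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode code point cp as UTF-8 into out.
 * Returns the number of bytes written (1 to 4), or 0 if cp is a
 * surrogate or is above U+10FFFF.  Hypothetical helper.
 */
size_t
utf8_encode(uint32_t cp, unsigned char out[4])
{
	assert(out != NULL);
	if(cp < 0x80){
		out[0] = (unsigned char)cp;	/* ASCII passes through unchanged */
		return 1;
	}
	if(cp < 0x800){
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if(cp < 0x10000){
		if(cp >= 0xD800 && cp <= 0xDFFF)
			return 0;	/* surrogates are not characters */
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if(cp <= 0x10FFFF){
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}

/* Return 1 if every byte in buf has its high bit set, else 0. */
int
utf8_all_high(const unsigned char *buf, size_t n)
{
	size_t i;

	for(i = 0; i < n; i++)
		if(buf[i] < 0x80)
			return 0;
	return 1;
}
```

Encoding '/' (U+002F) gives the single byte 0x2F, while Cyrillic U+0416
gives 0xD0 0x96 and utf8_all_high reports 1 for it: no byte of a multibyte
sequence falls in the ASCII range, so byte-oriented tools that look for '/'
or newline keep working.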




Wed Mar 22 18:22:35 EDT 2023
#include <stdio.h>
#include <stdlib.h>

enum Token {
	LINVALID,
	LEND,
	LRIGHT,
	LLEFT,
	LINC,
	LDEC,
	LOUT,
	LIN,
	LLOOP,
	LENDL,
};

enum Symbol {
	INVALID,
	END,
	RIGHT,
	LEFT,
	INC,
	DEC,
	OUT,
	IN,
	LOOP,
	ENDL,
};

typedef struct Sym Sym;

struct Sym {
	int inst;
	Sym *loop;
	Sym *next;
};

typedef struct Bf {
	int *i; // instructions
	size_t in; // instructions number
	size_t is; // instructions size
	unsigned char *d; // data
	int *ip; // instruction pointer
	unsigned char *dp; // data pointer
} Bf;

void *
emalloc(size_t s)
{
	void *p;

	p = malloc(s);
	if(!p){
		fprintf(stderr, "Error: Out of memory.\n");
		exit(EXIT_FAILURE);
	}
	return p;
}

void *
erealloc(void *p, size_t s)
{
	p = realloc(p, s);
	if(!p){
		fprintf(stderr, "Error: Out of memory.\n");
		exit(EXIT_FAILURE);
	}
	return p;
}

int
lex(FILE *f, size_t *l, size_t *b)
{
	int c;

	/* *l and *b are the line and column of the character just read. */
	while((c = fgetc(f)) != EOF){
		*b += 1;
		switch(c){
		default:
			fprintf(stderr, "Error: Invalid character at line %zu column %zu: %c\n", *l, *b, c);
			exit(EXIT_FAILURE);
		case '>': return LRIGHT;
		case '<': return LLEFT;
		case '+': return LINC;
		case '-': return LDEC;
		case '.': return LOUT;
		case ',': return LIN;
		case '[': return LLOOP;
		case ']': return LENDL;
		case '\n':
			*l += 1;
			*b = 0;
			break;
		}
	}
	if(ferror(f)){
		fprintf(stderr, "Error: I/O error when reading file\n");
		exit(EXIT_FAILURE);
	}
	return LEND;
}

Sym *
parse(FILE *f)
{
	int t;
	size_t l, b;
	Sym **stk; // stack of open loops
	size_t stkn, stks; // entries in use, capacity in entries
	Sym *h;
	Sym *n;
	Sym **p;

	l = 1;
	b = 0; // lex increments the column before reporting it

	stkn = 0;
	stks = 1;
	stk = emalloc(stks * sizeof(*stk));

	h = 0; // an empty program yields an empty list
	for(p = &h; (t = lex(f, &l, &b)) != LEND;){
		*p = emalloc(sizeof(**p));
		n = *p;
		n->inst = INVALID;
		n->loop = 0;
		n->next = 0;
		switch(t){
		default:
			fprintf(stderr, "Error: Invalid token at line %zu column %zu: %d\n", l, b, t);
			exit(EXIT_FAILURE);
		case LRIGHT:
			n->inst = RIGHT;
			p = &n->next;
			break;
		case LLEFT:
			n->inst = LEFT;
			p = &n->next;
			break;
		case LINC:
			n->inst = INC;
			p = &n->next;
			break;
		case LDEC:
			n->inst = DEC;
			p = &n->next;
			break;
		case LOUT:
			n->inst = OUT;
			p = &n->next;
			break;
		case LIN:
			n->inst = IN;
			p = &n->next;
			break;
		case LLOOP:
			// the loop body hangs off n->loop; grow the stack by doubling
			n->inst = LOOP;
			p = &n->loop;
			if(stkn >= stks){
				stks *= 2;
				stk = erealloc(stk, stks * sizeof(*stk));
			}
			stk[stkn++] = n;
			break;
		case LENDL:
			// ENDL terminates the body; continue after the matching LOOP
			n->inst = ENDL;
			if(stkn == 0){
				fprintf(stderr, "Error: Too many loop endings at line %zu column %zu.\n", l, b);
				exit(EXIT_FAILURE);
			}
			n = stk[--stkn];
			p = &n->next;
			break;
		}
	}
	if(stkn > 0){
		fprintf(stderr, "Error: Not enough loop endings.\n");
		exit(EXIT_FAILURE);
	}
	free(stk);

	return h;
}

void
emit(void)
{
}

void
run(void)
{
}

int
main(void)
{
	Sym *s;

	s = parse(stdin);
	printf("lol\n");
	return EXIT_SUCCESS;
}


Wed Mar 22 10:20:19 EDT 2023
But if enough people see beyond the veil, it’s not a veil anymore - it’s just
lingerie the powerful wear when fucking us.

Wed Mar 22 10:09:20 EDT 2023
yes, a lack of working capital is holding me back.
no, i'm not clicking that.
thanks.
bye.

