-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathTODO
1193 lines (1026 loc) · 56.5 KB
/
TODO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
Foundations of Data Science
http://nuit-blanche.blogspot.ca/2014/10/book-foundations-of-data-science-by.html
http://www.cs.cornell.edu/jeh/book11April2014.pdf
Understand generalized linear regression/models
PL/logic research
isabelle meta inference rules
http://books.google.ca/books?id=RxlhqG0-cGwC&pg=PA105&lpg=PA105&dq=isabelle+meta+%22inference+rules%22&source=bl&ots=zWNHZ1iChr&sig=7nXyiB4A5jwzjGeRYjEPbfE4BC0&hl=en&sa=X&ei=FkddUrCpDuXEyQGh5YHQDA&ved=0CHAQ6AEwBw#v=onepage&q=isabelle%20meta%20%22inference%20rules%22&f=false
partial functions, tail recursive, soundness, j moore, dave greve
http://www.seas.upenn.edu/~plclub/poplmark/
http://iron.ouroborus.net/
https://www.cs.uoregon.edu/research/summerschool/summer14/curriculum.html
research related to computation, mathematics, mind?
ftp://ftp.cs.ru.nl/pub/CompMath.Found
OS dev
http://wiki.osdev.org/Tutorials
http://www.coranac.com/tonc/text/asm.htm
scalable universe: entanglement
also, how do we seamlessly transition between processing regions?
some sort of hierarchical annealing
allow tiling of regions
more efficient macroscopic evolution of sectors
remember "observed" region states and modifications
to reclaim efficiency, gradually eliminate modifications over time, explainable via weathering, entropy, etc.
how quickly can a new world/region/sector be generated? and then evolved to be interesting?
minimize the amount pre-generated, that would otherwise sit around wasting resources
================================================================
wanted: size inference (and polymorphism) to calculate totality information
and distinguish inductive and coinductive situations
polymorphism over inductive/coinductive uses
the same function may have different required evaluation orders, depending
the way to avoid non-termination is to never evaluate under neutral exprs
unless new dynamic bindings are satisfied
evaluation only continues with exprs whose head has a dynamic binding in
a neutral position
let dv stand for a dynamic var
dv is in a neutral position at the head of the following exprs:
dv ## the lone var
(dv expr)
(case dv of ...)
(\... -> dv)
(type-of dv)
but not in the following:
(expr dv)
but the 'dv' sub-expr itself is neutral, and it alone is evaluated
btw, 'type-of' only operates dynamically, and so preserves parametricity
it won't produce more than a skolem constant for any term whose type ends
up as a quantified type var
allow re-entering even under lambdas upon instantiation of a neutral dynamic
binding, but this is only once at each such site, not repeatedly during
recursive calls; repetition only happens when evaluation naturally reaches
such a point; this way, totality should ensure termination
dynamic application to monomorphic vars force monomorphic type resolution of
that dynamic lambda, but not necessarily everything underneath
monomorphic application to dynamic vars produce new monomorphic constraints
for such vars, similar to intersection types
if such a dynamic var is only used at monomorphic positions, and all of its
generated constraints can be unified, it need not remain dynamic and is
monomorphised, ie:
toList : forall a. a -> [a]
(\f* x* -> f (f x)) toList
\x* -> toList (toList x) ## x only used monomorphically, drop the star
\x -> toList (toList x) ## full type : forall a. a -> [[a]]
now the original dynamic lambda should show the following type
(only at that position, other exprs with this lambda are unaffected, even
if coming from the same name/definition):
((\f* x* -> f (f x)) : forall a. (forall b. b -> [b]) -> a -> [[a]]) toList
in this case, you get something similar to what the ML^F-related family of
type systems gives, even without annotations for lambdas not defined inline
with its application (though you still need the dynamic binder annotation)
actually, an even simpler model
monomorphic application to dynamic vars still produces new monomorphic
constraints, (together forming an intersection type)
simple vs. dynamic binding doesn't otherwise matter for any application, and
evaluation always proceeds as usual up to neutral terms
at any point that there are dynamic lambdas that cannot be erased, the vars
they bind are monomorphised
unless some unsafe/dynamic flag is enabled yada yada
type vars that have not yet been unified nor quantified are considered
dynamic entry points for the purpose of determining still-reachable dynamic
lambdas
if a type var ends up unifying with a concrete type, that concrete type
supplies an abstract value that may satisfy continued dynamic evaluation
the above example still turns out the same way after such monomorphing
as mentioned above, unifying type variables to a head-concrete form (a type
constructor's args may still be vars) is also an event triggering continued
dynamic evaluation (so is quantifying a type var? it becomes skolemized?)
what is the parallel between dynamic and type vars?
dynamic evaluation operates in the type checking phase, so in this way,
dynamic vars and type vars are similar, though they differ in that type vars
unify whereas dynamic vars are substituted during reduction
how do value ranges get tracked?
in some way similar to polymorphic variants?
if it's possible to access a record whose key range is incomplete
forces key expr to have dynamic value? upward propagation?
maybe it's better to just end up with a type error: no variant will match
otherwise, value can be monomorphic since the full range is accessible
should the same happen if access returns non-unifiable alternatives?
or maybe it could be due to an alternative containing a dynamic expr
but it's not just dispatching on a value's type that causes this
incomplete dispatch can also cause it
or complete dispatch leading to a dynamic expr
final thoughts here: don't do anything special for now, do something similar
to polymorphic variant unification; programmer can mark vars dynamic in
response to these type errors
key select vs. variant dispatch
records with full key ranges allow for monomorphic key vars
records with partial ranges need dynamic key vars or something similar to
variants (unless a constraint solving system is developed)
constraint solving would propagate assertions upward until they are either
found to have been calculated, or concrete values are found from which the
asserted property can be calculated
examples:
0 <= x < 10
length y > 0
z : List a {<Cons=(a, List a)>} ## as opposed to a full 'List a'
Sorted xs
================================================================
assertions?
enforce satisfaction of arbitrary properties by attaching proof obligations to
definitions in the form of predicates
how much automatic solving can be achieved?
how are manual solutions specified?
================================================================
two different sorts of super polymorphic bindings could be distinguished
(not sure if this is necessary to make things work)
polymorphic vs. dynamic
polymorphic is mostly concerned with types depending on types
if a polymorphic arg is itself polymorphic/dynamic, it's inlined
dynamic is concerned with types depending on values
it's an error to provide a simple variable as a dynamic arg
alternatively, if these concepts are kept together, things would be simpler
(preferred approach)
providing simple vars as poly/dynamic args forces simple typing
in either case
recursive dynamic applications should be memoized to avoid infinite unrolling
repeated application to the same args leaves evaluation incomplete
this should be temporary until more dynamic args are specified
it's an error for evaluation to be incomplete when all dynamic args have
been specified; is this only possible by passing simple vars?
give 'fix' binders their own form of polymorphism to support poly recursion
if a type annotation is given, unification proceeds as usual
during type inference for the body, a fix binder is bottom: forall a. a
all unification attempts with the binder are saved as constraints for later
inference obtains a candidate type (may still be an incomplete poly type)
once all (if any) required poly bindings are provided
the full type should be generalized and used to solve the constraints
solving the constraints is just for discharging proofs
it should not affect the resulting type
internal (recursive) completions of a fix binder are resolved independently
this will probably discharge proofs before the parent is completed
in fact, i think that in most poly type cases all proofs will be
discharged before parent completion
================================================================
it turns out that super polymorphism can also be used for dependent types
case analysis desugars into record access
record access typing involves unifying all possible result types
possible result types are refined if the access depends on a super
polymorphic value; that value must be present to proceed with typing, and
since the value is present, only a single case alternative is considered
for value dependencies that aren't directly referenced in resulting types,
unification constraints are generated
satisfaction proofs must be discharged when values fulfill the dependencies
done automatically, as shown in the polymorphic recursion example below
it's possible that annotations will sometimes be needed to fully discharge
is it possible that all fixpoint bindings can be super polymorphic even in
settings where they could be monomorphic? i'm thinking yes at the moment
notation for mutually embedding terms and values (types in these examples):
>|t|< indicates a neutral term (has type: Value)
<|v|> indicates a literal value (has type: Term)
examples (some of which are pretty hard to read):
================================
a polymorphic recursion example, adapted from
(https://en.wikipedia.org/wiki/Polymorphic_recursion)
haskell requires an annotation to type this example
super polymorphism can type this example without annotation:
data Nested a = a :<: (Nested [a]) | Epsilon
infixr 5 :<:
## length itself is made super polymorphic in this case
length Epsilon = 0
length (_ :<: xs) = 1 + length xs
desugars to (leaving case analysis sugared for simplicity):
fix length* = \nested -> case nested of
Epsilon -> 0
(_ :<: xs) = 1 + length xs
typing will begin as follows:
forall a.
>| fix length* = <|Nested a -> >|case <|Nested a|> of
Epsilon -> <|Int|>
(_ :<: xs) = <|Int|> + length <|Nested [a]|>
|< |>
|<
case analysis does not depend on a super polymorphic value,
so alternatives are unified:
Int UNIFY >|<|Int|> + length <|Nested [a]|>|<
...simplifying...
Int UNIFY >|length <|Nested [a]|>|<
which remains as a proof to be discharged
however it can only be unified to Int, giving us length's return type
so, we have a generalized type
length : forall a. Nested a -> Int
with a proof to discharge
Int UNIFY >|length <|Nested [a]|>|<
maybe noted more like:
length : forall a. (Int UNIFY >|length <|Nested [a]|>|<) => Nested a -> Int
since we already have the value depended on (length itself) we can proceed
assuming the conclusion (I think this should be valid given termination)
Int UNIFY >|<|forall t. (Nested t -> Int)|> <|Nested [a]|>|<
...
instantiate, fresh b:
Int UNIFY >|<|(Nested b -> Int)|> <|Nested [a]|>|<
...
Nested b UNIFY Nested [a]
Int UNIFY Int
...
b UNIFY [a]
Int UNIFY Int
...
b = [a]:
Int UNIFY Int
QED
================================
## where apply and its second arg are super polymorphic
apply f [] = f
apply f [first, rest] = apply (f first) rest
desugars to:
fix apply* =
\f args* ->
({[] = \[] -> f,
[_, _] = \[first, rest*] -> apply (f first) rest
} @ (type-of args)
) args
for instance, if
func : A -> B
then
apply func : >|\args* ->
({[] = \[] -> <|A -> B|>,
[A, _] = \[<|A|>, rest*] -> apply <|B|> rest
} @ (type-of args)
) args|<
which basically means that giving f the type A -> B constrains args to be
args : [] | [A, _]
where | means 'or'
constraining in the other direction also works; specifying part of the
structure of 'args' will demand a certain type from 'f'
\func more* -> apply func [A, [B, more]] :
(A -> B -> R) -> >|\more* -> apply <|R|> more|<
================================
## format' and the first arg of format and format' is super polymorphic
format str = format' str ""
where format' ("%s" ++ rest) acc = \str -> format' rest (acc ++ str)
format' ("%%" ++ rest) acc = format' rest (acc ++ "%")
format' (ch :: rest) acc = format' rest (acc ++ [ch])
format' "" acc = acc
desugars to (leaving case analysis sugared for simplicity):
format =
\str* -> (fix format'* =
\str* acc -> case str of
("%s" ++ rest*) -> \str -> format' rest (acc ++ str)
("%%" ++ rest*) -> format' rest (acc ++ "%")
(ch :: rest*) -> format' rest (acc ++ [ch])
"" -> acc
) str ""
for instance:
\more* -> format ("this: %s, and " ++ more)
: >| \more* -> <| String -> >| (fix format'* =
\str* -> <| String -> >| case str of
("%s" ++ rest*) -> <| String -> >| format' rest <|String|> |< |>
("%%" ++ rest*) -> format' rest <|String|>
(ch :: rest*) -> format' rest <|String|>
"" -> <|String|> |< |>
) more <|String|> |< |> |<
================================================================
Flesh these examples out in terms of how type inference works:
================================
This one should end up being like: ((a -> b) & (c -> d)) -> a -> c -> [b, d]
\f* x y -> [f x, f y]
================================
polytwo is like: ((a -> b) & (b -> c)) -> a -> c
(polytwo polytwo) should be like a polyfour:
((a -> b) & (b -> c) & (c -> d) & (d -> e)) -> a -> e
let polytwo = \f* x* -> f (f x)
in polytwo polytwo
while both two and (two two) are simply: (a -> a) -> a -> a
let two = \f x -> f (f x)
in two two
what about partially-polymorphic (pptwo pptwo)? degenerates to (two two)?
let pptwo = \f* x -> f (f x)
in pptwo pptwo
================================
consider this to be apply f args*:
(note that empty tuple [] and pair [x, y] have different types)
apply f [] = f
apply f [first, rest] = apply (f first) rest
================================
I'm not sure how to indicate the polymorphism here without dependent types:
format str = format' str ""
where format' ("%s" ++ rest) acc = \str -> format' rest (acc ++ str)
format' ("%%" ++ rest) acc = format' rest (acc ++ "%")
format' (ch :: rest) acc = format' rest (acc ++ [ch])
format' "" acc = acc
=======================================================================================
current feeling: assuming it always subsumes the inductive analysis described
below, CFA may give effective solutions to recursive shapes in polymorphic
type continuations (which makes sense since safety checks based on 0CFA behave
like recursive types with subtyping); is this the best/simplest approach?
flesh out polymorphic resumable typing continuations; given the below, it
seems these need to support type shapes that are potentially recursive
how does recursion work in the polymorphic case? how does letrec bind names?
first attempt at [polymorphic] binding shapes: how to handle polymorphic return
types? polymorphic return types whose only poly-dependencies are (recursively)
themselves become monomorphic? seems to not always hold if not tail recursive
consider two examples:
ending up monomorphic
func must return Num here, and so must its base case (if there
is no base case, it's Void): return (func a b + c)
how to resolve recursive type if base case is Int? say:
R = 0:Int | (Num a => (R + Num a))
how about looking at it inductively: base case is 0:Int, one step up from
that is (Num a => (0:Int + Num a)) which becomes Int
actually polymorphic
(how deep is the tower of tuples?): return [func a b, c]
if base case above is returning an empty tuple, this has recursive type:
R = [] | [R, c]
looking at this one inductively:
base case is [], one step up is [[], c], which yields no simplification
throw in pi/sigma types or plain records/variants, and what happens?
do typeclasses/implicits still fit?
still must be able to run code dynamically if type checking is incomplete
=======================================================================================
symbol (1D) sets: ordered/unordered, discrete, ordered ranges (one or two endpoints), infinite
examples: ints (ordered infinite), nats (ordered range with 0 as minimum); what would an unordered infinite set be like?
ordered ranges may support modular arithmetic (wrapping/circular)
subtractive sets? maybe these fall under the below predicate-based sets
sets may be built with predicates over other sets
rows map a set of symbols to data
records and variants built on rows; tuples can be thought of as records with contiguous nat fields (joining and splitting shifts the nats though?)
should sets simply be records mapping the contained symbols to the unit value? what does it mean to just grab the keys or values of a record?
should records/variants support default values? the value a key would map to if the key is not already represented in the row
the danger with defaults is if you don't have a nice way to determine the difference between a row-represented key and a defaulted key
support undefined as a default value; maybe assigning a key to undefined implies deletion/hiding of the key?
support an if-else-like index/access primitive form that can determine if the index/access would be undefined/out-of-bounds
absolute index/access can be implemented in terms of it, such that the 'else' branch produces an undefined value itself
is it possible to instead support a primitive that will catch *any* undefined value?
in the scope of such a catcher, how do you avoid masking unintended errors?
can all possibly-undefined operations be detected and replaced with safety-checking code, such that failure triggers the catcher?
are bags/multisets more fundamental than sets? though they generalize sets, it seems hard to create a foundation with them
=======================================================================================
term : type
subtype <: type
===============
Set <: Type ? maybe just say Type for everything instead of distinguishing
Type : Type ? how to distinguish universes
nat : Nat : SymbolSet <: Set
int : Int : SymbolSet <: Set
... bounded and modular versions of the above ...
symbol : this symbol's primary/home set of symbols, constructed by Symbol(Nat (name/namespace), Nat (cardinality)) : SymbolSet <: Set
--tuple : set of values structurally described by a particular tuple of Sets of correct length : tuple struct-described powerset (aka TupleSet) <: Set
--array : set struct-described by Array(Type, nat) : ArraySet <: Set
record : set struct-described by Record(SymbolSet (of keys), particular record mapping fields to Types) : RecordSet <: Set
--variant : set struct-described by Variant(SymbolSet (of keys/tags), particular *record* mapping tags to Types) : VariantSet <: Set
sigma : set struct-described by Sigma(a:Type, b(a):Type) : SigmaSet <: Set
sealed : particular value of the appropriate(meaning what?) variant : SealedSet <: Set
closure : a -> b : ProcType <: Type
how to deal with open variants like Type? how does dispatch work for type classes?
as lexically-scoped dynamic keys/vars to the currently-known variant (or record for open dispatch such as type classes)?
dynamic vars need to treat dynamic environment as a first class record, dispatching on a particular label
what about the ever-growing symbol set associated with these variants?
maybe these symbols need to be multi-dimensional (namespace) to describe their origins to avoid clashing names/tags
but how many dimensions are needed? one per nested scope?
only type-defining scopes would need to be counted, but could be hard to determine statically
one dimension per thread-splitting context may be better; requires per-thread state for a counter
variants as restricted form of dependent product?
what restrictions? left projection can't be restricted to be a symbol/tag for the following to work
better explains dispatching on the type of a value
example ignoring [un]sealing:
Type = sigma(tag, {... EitherTag = (Type, Type) ...}.tag)
Either = \a b -> sigma(tag, {LeftTag = a , RightTag = b}.tag)
Any = sigma(tag, {... EitherTag = sigma((type1:Type, type2:Type), Either type1 type2) ...}.tag)
case (type-tagged-val : Any) of ... | (Either Int String, value) -> (expr) | ...
becomes:
let inner-dispatch = \(ty1:Type) (ty2:Type) (value:Either ty1 ty2) ->
{...
IntTag = \() (ty2:Type) (value:Either Int ty2) ->
{...
StringTag = \() (value:Either Int String) -> expr,
...}.[sigma-left ty2] (sigma-right ty2) value,
...}.[sigma-left ty1] (sigma-right ty2) value
in let case-table = {...
EitherTag = \sigma((ty1, ty2), (value:Either ty1 ty2)) -> inner-dispatch ty1 ty2 value,
...}
in case-table.[sigma-left type-tagged-val] (sigma-right type-tagged-val)
looks a little sloppy: maybe a special dispatching index operation would make typing easier?
or, figure out simple dependent typing rules
=======================================================================================
predicates/comparisons, constraints, proof witnesses for indexing tuples/arrays/records
=======================================================================================
witness erasure?
index constraint in function type describes an implicit argument (witness term)
the value retrieved from the structure at the index is the proof witness itself
term referring to witness to index a structure simply becomes the implicitly-passed value
forall struct. Access(struct, {3=Int}) => struct -> Int
type of functions taking structs containing an Int at position 3, and returning an Int
forall struct. Access(struct, {index=Int}) => struct -> (index : Nat) -> Int
this one is more general
Access extendable like typeclasses?
erasure produces a function more like: Int -> Int
AccessDomain(struct, Set (of indices), result)
may be nicer to express unpredictable or large contiguous accesses
harder to erase witnesses for these?
Actually, the original Access constraint with a record mapping indices to result types
is sufficient to describe AccessDomain if records can support default values
example: AccessDomain(struct, Nat, Int)
Access(struct, {(_:Nat)=Int}) assuming _=X defines default value X for Nat indices
should records support multiple defaults? different subsets, different mappings
this seems a lot like defining dependent functions...
maybe functions should also support Access?
at this point, might as well just use record notation for all structures?
tuple/array syntax as syntactic sugar? how about tuple/array operations?
array notation/behavior corresponds to 'default value' concept from records
tuple/array splitting and concatenating operations remap keys
so should be distinct from record extension/joining and hiding concepts
this is probably only applicable to records whose keys are from Nat
forall struct. Access(struct, {index=Int}) => struct -> (index : Nat) -> Int
becomes: {index = Int} -> (index : Nat) -> Int
=======================================================================================
for tuples/arrays:
map, filter, fold, zip[With], [un]curry...
present efficiency challenges related to contiguous memory arrangement
uniqueness for efficient initialization?
or maybe smart region allocation to make seemingly quadratic operations linear?
strings (of various encodings), other data blobs:
generalized by bit vectors?
extensible parser for designing appropriate literals (and their translation to bit vectors)
can bit vectors/sets be efficiently generalized by tuples/arrays of single-bit values?
yes, saner and more general: tuples/arrays of (modular?) nats/ints
what are symbols and their namespaces really?
figure out tractable dependent-sum/existential representation for variants
open sums and their handles
example list variant:
letrec List a = List-sigma (tag:ListTag, {nil=(); cons=(a, List a)}.tag)
ListTag = TagSet {nil, cons} -- do tag sets also have a representation? (brain explodes)
List-sigma, List-sigma-opener = ... new-sealed-sigma-pair ...
==============================================================================
modular typing with non-terminating term?
(\f* -> f f) : (f & (f -> g)) -> g
(\f* -> f f) (\f* -> f f) :
(f & (f -> g)) -> g
(f & (f -> g)) -> g
({f} -> {g}) -> g
({f} & (f -> g)) -> {g}
g
forall g. g -- but I had to make intuitive jumps to reach this?
in other words: bottom
0CFA seems capable of indicating this, but it's not necessary
it's syntactically indicated in CPS (no continuation (let alone 'halt') is ever invoked)
let u = \f k -> f f k
in u u halt
figure out convenient staging mechanism for modular typing
syntactically check for non-termination
if terminating, carry out beta reduction at each (non-trivial?) attempt to unify an intersection type?
attempt to type residual term normally
simpler idea: continue reducing all terms until they no longer contain any poly(poisonous)-lambda terms
two contains poison that must be expelled:
two = \f* x -> f (f x) : 'a 'b 'c. (a -> b & b -> c) -> a -> c
could also mark x* if residual types should potentially retain poison; see bottom result
appid = \w g y z -> (w, g y, z) : forall c a b d. c -> (a -> b) -> a -> d -> (c, b, d)
flip = \f x y -> f y x : forall a b c. (a -> b -> c) -> b -> a -> c
flip (flip appid two) two
whatever, just start here:
\k -> (\j -> appid j two) k two
step1 = \j -> appid j two
appid: \w{j?} g{two} y z -> (w, g y, z)
two: \f{y} x -> f (f x) -- y is now poisonous too
appid: \w{j?} g{two} y* z -> (w, g y, z)
step1 : a' b' c' d' e' . e -> (a -> b & b -> c) -> d -> (e, (a -> c), d)
step2 = \k -> step1 k two
step1: \j{k?} -> appid j two
appid: \w{j?} g{two} y{two} z -> (w, g y, z)
two: \f{two} x -> f (f x) -- about to poison x by passing it to two (as f)
two(1): \f{x?} y -> f (f y) -- x is now poisoned
two: \f{two} x* -> f (f x){\f{x?} y -> f (f y)} -- note the mark of poison
two: \f{\f{x?} z -> f (f y)} z -> f (f z) -- z is never used as f, so it remains clean
two(1): \f{x?} y{z} -> f (f y)
two: \f{\f{x?} z -> f (f y)} z -> f (f z){\f{x?} y{z} -> f (f y)}
two(1): \f{x?} y{\f{x?} y{z} -> f (f y)} -> f (f y)
two(2): \z -> (\f{x?} y{\f{x?} y{z} -> f (f y)} -> f (f y))
two(3): \f{two} x* -> (f (f x)){\z -> (\f{x?} y{\f{x?} y{z} -> f (f y)} -> f (f y))} -- if we inline sharing notation here: \x z -> x (x (x (x z)))
appid(1): \w{j?} g{two} y{two} z -> (w, (g y){\f{two} x* -> (f (f x)){\z -> (\f{x?} y{\f{x?} y{z} -> f (f y)} -> f (f y))}, z) -- call it bleh
step1: \j{k?} -> (appid j two){bleh}
step2: \k -> -- whatever, i was inconsistent with my notation, so this won't make sense
skip this block and read block below it first:
anyway, if i wasn't a moron, this should be equivalent to:
\k -> (\j -> appid-new j k)
appid-new = \w z -> (w, (\f x -> f (f (f (f x)))), z) : forall a b. a -> b -> (a, (forall c. (c -> c) -> c -> c), b)
note loss of poison/modular-polymorphism; sometimes this isn't desired? see above
if marking x* in two originally, you should get:
appid-new = \w z -> (w, (\f* x* -> f (f (f (f x)))), z) : forall a b. a -> b -> (a, (f' g' c' d' e'. (f -> g & g -> c & c -> d & d -> e) -> f -> e), b) -- hooray poison
aside from mark for assured polymorphism, a mark that only activates polymorphism when it's shallowly apparent might be good
\f* x*? -> f (f x) -- there is no shallowly apparent polymorphic use of x, so treat it monomorphically
the point is to allow the programmer to later change the body such that x is used polymorphically, without having to change marking
actually, i was wrong, this example still does yield polymorphism in x: \x* z -> x (x (x (x z)))
x's use as f poisons it during reduction
ML^F: how to type the following? some non-principal example types:
\f a b -> (f a, f b)
forall s r. (s -> r) -> s -> s -> (r, r)
forall a b r. (forall s. (s -> r)) -> a -> b -> (r, r)
forall a b. (forall s. (s -> [s])) -> a -> b -> ([a], [b])
this doesn't work:
forall a b rr (ra :> rr) (rb :> rr) (f :> (forall s (r :> rr). s -> r)). f -> a -> b -> (ra, rb)
I think the disclaimer is that ML^F can't infer an all-encompassing type here because f is used polymorphically? (requiring a type annotation)
two = \f x -> f (f x)
(a -> b & b -> c) -> a -> c
two two
[(a -> b & b -> c)] -> a -> c
[(a -> b & b -> c) -> (a -> c)]
(a -> b & b -> c) -> (a -> c) = (a1 -> b1 & b1 -> c1) -> (a1 -> c1) & (a2 -> b2 & b2 -> c2) -> (a2 -> c2)
(a -> b & b -> c)
(a1 -> b1 & b1 -> c1) -> (a1 -> c1)
(a2 -> b2 & b2 -> c2) -> (a2 -> c2)
(a -> b)
(a1 -> b1 & b1 -> c1) -> (a1 -> c1)
(b -> c)
(a2 -> b2 & b2 -> c2) -> (a2 -> c2)
a
(a1 -> b1 & b1 -> c1)
b
(a2 -> b2 & b2 -> c2)
(a1 -> c1)
(a1 -> c1) = (a3 -> c3) & (a4 -> c4)
b *
(a2 -> b2) & (b2 -> c2)
(a3 -> c3) & (a4 -> c4)
c
(a2 -> c2)
======
a
(a1 -> b1 & b1 -> c1)
b = (a1 -> c1) = (a3 -> c3) & (c3 -> c4)
c
(a3 -> c4)
a -> c :=
a -> a3 -> c4
==============================================================================
flow example without first normalizing: copied 'two' before self-applying
(\f x -> f (f x)) (\g a -> g (g a))
f= (\g a -> g (g a))
(\x -> f (f x) {f = (\g a -> g (g a))})
@x=x
g= x
--f: x -> (\a -> g (g a) {g = x})
g= (\a -> g (g a) {g = x}) -- should g closure also be recursive? though it wouldn't matter for this example
--f: (\a -> g (g a) {g = x}) -> (\a -> g (g a) {g = x, g = (\a -> g (g a) {g = x})})
(\a -> g (g a) {g = x, g = (\a -> g (g a) {g = x})})
@a=a
x: a -> b*
x: b -> c*
c
&
x: a -> b
x: b -> c
x: c -> d*
x: d -> e*
e
(a -> b & b -> c & c -> d & d -> e) -> a -> (c | e) -- result union due to closure conflation of g
==============================================================================
flow example without first normalizing: shared 'two' self-applied
(\two -> two two) (\f x -> f (f x))
two= (\f x -> f (f x))
(two two {two = (\f x -> f (f x))})
f= (\f x -> f (f x))
(\x -> f (f x) {f = (\f x -> f (f x))})
@x=x1
f= x1
f= (\x -> f (f x) {f = (\f x -> f (f x)), f = x1})
(\x -> f (f x) {f = (\f x -> f (f x)), f = x1, f = (\x -> f (f x) {f})})
@x=x2
f= x2
(\x -> f (f x) {f = (\f x -> f (f x)), f = x1, f = (\x -> f (f x) {f}), f = x2}) -- same state except for the injected x's (could go on forever if they count as differences)
&
x1: x2 -> a*
x1: a -> b*
b
&
x= x2
repetitive garbage...
&
x= x2 -- again
x1: x2 -> a
x1: a -> b
b
x1: b -> c
x1: c -> d
d
(\x -> f (f x) {f = (\f x -> f (f x)), f = x1, f = (\x -> f (f x) {f}), f = x2}) : final = mu self. (x2 -> a & a -> b & b -> c & c -> d) -> self & x2 -> (self | b | d)
(x2 -> a & a -> b & b -> c & c -> d) -> x2 -> (final | b | d) -- can final be eliminated? it seems to be a pointless unfolding/bottom
==============================================================================
Symbols as fundamental data atoms
hierarchical structure providing namespacing and contextualization
when unambiguous, less information is necessary to represent a symbol
data tags are symbols
[in]finite sum types
constructors are namespaced symbols
each such type acts as a namespace for its constructor symbols
a data node only needs to be tagged with enough symbolic info to disambiguate
ex: naturals as inifinite sum types: symbols interpreted within a Nat context
symbols are associated with textual representations/names for readability
AGDA-like universe polymorphism: data types available at any Set-level; Set is parameterized by Nat
multidimensional arrows for parallelism with possibly non-deterministic effects
dependency tree: nodes point to dependents, store dependency arity, and their completion effect
linear ordering for simple and positional application
name map for keyword application
existential typing question? it normally requires rank-2 universal quantifiers
((a -> b) -> a -> result) -> Obj b -> result
typecasing as a constraint
forces type information to flow into functions
also allows you to see which functions inspect type; we get parametricity back?
data representation for various stages of interpretation/compilation
how will runtime code generation be supported?
will modules store data needed to compile procedures it stores?
maybe the procedure objects themselves should store this data repr
erasable when unused; see value erasure somewhere below
system-level procedures for simple c-like translation? no gc; no auto-parallelism (w/ threads; optimal reordering is still fine)
general procedures with module-like linking behavior with parallelism available in dependency order for sub-computations?
implicit union and intersection constructors
are intersection constructors the same as record constructors? (and so unions are variants?)
type-inference algorithm
do these dependently-typed terms have principal typings? they don't need to; use CFA to deal with them
can probably get away with it via syntactic embellishments or type annotations
dependent-typing functions can use (possibly run-time) partial evaluation / (JIT) abstract interpretation
ex: sprintf :: (x::String) -> SPrintfType x
procedure calls as module linking:
modules have two possible linking interfaces with corresponding procedure application analogies/protocols:
name-based inputs (call w/ keyword args)
position-based translation to input names (call w/ positional args, allowing possible skipping of positions)
eval-within/proceed form asks for a given expression to be evaluated within a particular module
if the module is complete, the result is the evaluation
if the module is incomplete, the result is a "proceedure", though position-based calling is an orthogonal feature
as the procedure is partially applied/linked, the results are also procedures
once fully applied, the result is the evaluation
procedure application translation:
eval-within/proceed with the expression body and desired procedure module
link args w/ procedure module via one of the above protocols
value of the expression is produced if linking was complete; a partially-linked module is produced otherwise
stack model:
[...cont...[inputs]] - push args, then enter module computation
[...cont...|[inputs][output space?]...locals...] - reserve space for results?, perform local computation
[...cont...[former inputs?][outputs]] - continue with output results
how to reconcile stack model of mutually-recursive linking?
incorporate fully-parallelizeable procedures:
expose outer and inner evaluation parallelism (argument evaluation vs. body evaluation)
synchronization points to control parallelism and allow side-effects
fundamental syntax: apply-parallel and apply-serial
evaluation under lambda: partially applied procedures; when is it appropriate? (strict vs. lazy argument returns)
reconcile complexities in coordination of application site provisions with procedure body dependencies
value-level (semi?)unification:
why should the type-level have all the fun?
procedure-app linking coordinator can reach upwards from pattern argument into constructor computation to pull out
the true dependencies asynchronously to avoid having to wait for an artificial synchronization due to constructor app
data repr:
data types from universes as in Conor McBride's ornaments paper
records/modules
structs/tuples
arrays - ephemeral dynamic arrays via linear types
inductive/coinductive algebraic type layer
pointed/cyclic?/lifted/mutable/linear/shared
memoization interface
are thunks a special case?
memoized functions are different from thunks, because forcing a thunk does not syntactically involve application to ()
is explicit support for memoized functions actually necessary? does a finite map paired with a function suffice?
can every instance be conveniently expressed as dynamic programming? possibly not
codata field slots should be non-strict, but what is a convenient way to specify whether to use memoization?
memoization can save computation, but could also cause space leaks, so it's good to have a choice
automatic translation of nat/int-like structurally recursive values to machine-level binary number representation
infinite sum types? seems plausible using same idea as constructor tagging of pointers
unary recursive structures? S S S S S Z -> 0b101
n-ary recursive structures?
or maybe the recursive structures should be view types of the machine-level repr?
records:
the name 'record' describe a type interface, not an actual data representation
the closest thing to a canonical record value is probably a lexical environment
data reprs with record interfaces include:
environments, which are stacks of sub-records describing lexical frames; also supports de Bruijn-like addressing
data reprs that define instances of the projection typeclass
form of subtyping based on row-types, as in Morrow
free extension without overwriting duplicate fields -- lexically scoped labels
extension becomes like environment extension
record field access -- constant-time lookup via implicitly-passed label->index args
environments as records: variable lookup as field lookup
closure and arguments as records
procedure types can include parameter names (symbols):
f : (one : A) -> (two : B) -> ...
f one two = ...
partially/fully apply procedure to arguments by name or position, even with non-flat procedures:
call procedure with a record supplying arguments
f : (one : A) -> (two : B) -> (three : C) -> ...
by name:
(f :@ {two=...}) : (one : A) -> (three : C) -> ...
or by position
(f :@ {2=..., 3=...}) : (one : A) -> ...
proc could be non-flat; closure would simply store pending name/position fields until applicable
the type says eventually there are named params: two and three
f : (one : A) -> (two : B) -> (three : C) -> ...
f one = ... computation here ... eventually return (proc two three = ...)
essentially:
(f :@ {three=val}) ==> proc one two = f one two val
though maybe not implemented this way; with enough info, could be smarter and possibly finish computations
that only depend on three early, even within the nested proc
per-argument partial-application-handling code wrapped with closures to do this?
if names and positions are mixed, positions should take precedence
modules: (how much of this is compatible with the above records?)
basic module form that does no special parsing or special syntactic expansion with macros defined within it
a text/file-based module loader that is given a parser and can use expanders as they're defined
there could also be an intermediate loader for pre-parsed concrete syntax trees
syntactic expansion:
involves a renaming of public symbols (which could be used by many modules) to private symbols unique to that module
syntactic closures effectively close over private symbols to avoid symbol mis-captures
syntactic closures expanding to external dependencies
cause implicit linking of private names? public names still not available for use, preserving encapsulation
macros considered dependent on final module env:
they normally won't be available for export until all of a module's dependencies are satisfied
this also means values that store macros (ie., building a list of macros) are also still unevaluated
the final env can be forced complete early, producing a record with the macros satisfied; it doesn't mutate the module
of course will also satisfy dependencies of those values that store macros, so they will also be present
undefine as an interactive module-builder gimick:
create a new module and load current definitions into it, minus the symbol being undefined, and continue there
these definitions must not close over the previous module, to avoid capturing references to the old def
define form creates a binding within the enclosing [sub]-module's top-level environment
since these bindings are all placed in the same environment without extension, they are mutually-recursive
a module stores distinct SCCs of mutually recursive procedures/thunks, noting remaining dependencies for each
as definitions are evaluated, if they satisfy an existing SCC's dependencies, the SCC is relinked
this linking is asymmetric:
the dependent SCCs have their types updated
types of the dependencies don't depend on how they are used (more generally, types only depend on what they use)
though a separate use-based linking could be done to determine proc specializations
such dependency satisfaction may cause previously distinct SCCs to be coalesced (they're shown to be mutually recursive)
when an evaluation is actually taking place, its dependencies are calculated and extracted from the module environment
unsatisfied dependencies in the current module env cause it to be rebuilt, incorporating any additional defs seen
if the dependencies still aren't there, it indicates an 'undefined dependency' failure
unresolve dependencies: should such computations be saved up until linking?
cyclic dependencies without indirection (ie., thunks) are an error
would be nice, and wrt side-effects, those should be recorded in the order in which their dependencies are resolved
symmetric linking deterministically satisfies the side effects of one module before the other
also need a way to order incoming dependency symbols to allow proper ordering of side effects as they are satisfied
ex: should print out "hello, goodbye"
print msg
print "hello, "
msg = "goodbye"
then the side-effects should actually occur whenever the module is completed via linking, in the same order
or, maybe discharge of side-effects can be a separate operation after the linking is complete
or even anytime during partial linking for partially-completed side effects?
in this case, maybe such discharge should even be repeatable?
pre-procedures are open terms defined in a module where the module-level bindings are free
procedures can be created on demand during module eval by providing pre-procs with the most up-to-date module context
modules are immutable, so this context changes as linking occurs to reflect the newly-formed module
finally, evaluating a module produces a record reflecting its top-level environment
originally was thinking it was ok to project on any complete fields of a partially linked module
now I think this might be a bad idea; it's also a simpler model to only consider this for completed modules
asymmetric directed linking: one module's provisions are used to satisfy another's requirements
symmetric mutually-recursive linking: combine requirements and provisions, but resolving mutually satisfied requirements
mixins/traits become easy
procedure app explainable via very simple directed linking? allows targeted parameter applicaton? (ie. skipping of params)
dynamic 'opening' of modules: should specific symbols need to be specified?
reflection/introspection/inspection of exposed fields; debugger/authority able to inspect private environment
environments:
environments are records:
normally immutable, except in the case of module evaluation, during the evaluation itself?
the residual record produced by the module evaluation should be immutable
get-environment produces the current procedure environment as an immutable record:
since a module's environment changes as the module is evaluated, define a get-namespace for grabbing a ref to it
what about get-environment within a procedure defined in a module currently being evaluated? that will still break
but such procedures can be considered different each time a new definition is linked to its parent module
this "evolving" model can be implemented with lazy updates, only performing them when a procedure is demanded
or, update could be performed when a procedure's dependencies are linked/updated
another possibility: consider premature get-environment uses to be referencing undefined vars (in a way it does, right?)
or the module environment can be considered the last definition of a module, so premature refs to it are undefined?
or maybe implicit internal env access is a bad idea, and envs should only be accessible given a complete module in hand
alternative module/environment model:
rejecting this idea for now; convenience of the "evolving" model seems to outweigh its complexity
instead of insisting on modules essentially bootstrapping themselves into existence, consider them constructed externally
though this will prevent the definition of macros alongside with their uses (this is something like phase-separation)
separate issue: would be nice to be able to update module definitions and have their dependents re-link themselves
doing this causes fissure with running code in the old module context; after all, module update should not be mutation
let, let-rec, let-seq via record construction and use/open:
use ({x=...} :+: {y=...})
use {x=..., y=...}
use {x=...}; use {y=...}
or not? instead build a sub-module whose top-level defs are mutually recursive
type classes:
static/dynamic dispatch based on records: 'case' involves destructor/continuation-record fields labeled with type/data tags
ranging over values too; not just types
allow projections, something like: ProjClass Type symbol
class constraints a form of type function? based on records (see Functor example below)
data-polymorphic views: streams vs. lists, type class interaction with case
closed classes that allow overlapping instances (otherwise overlapping can produce undefined results)
named instances
explicit export/hide interface
implicit uses of methods search module context layer of original proc definition first, calling context last
which justifies the type simplification during instance linking in the below example
implicit uses should also be sealable, allowing only instances that exist in the context at the time of the sealing
typing example:
term: (map f xs : result) ; where
f : (x -> y)
xs : [x]
result : [y]
map is free
local constraint: map : (x -> y) -> [x] -> [y]
Functor class: map : (Functor f) => (a -> b) -> f a -> f b
after linking with class:
(map f xs : result) ; where
map : (Functor []) => (x -> y) -> [x] -> [y]
if a (Functor []) instance is available unhidden, link it too:
map : (x -> y) -> [x] -> [y]
otherwise, the constraint propagates upward as an implicit argument
named instances can be used directly like records:
Functor [] ; this is a type
get-instance (Functor []) ; something like this... is a value
F : Functor [] ## which is synonymous with ## F : {map : (x -> y) -> [x] -> [y]}
(F.map f xs : result) ; or use map in an open/use/with statement under F
case analysis:
not just a static syntax compilation into nested switches
pattern matching as selecting the field of a record containing destructors/continuations
able to dynamically build efficient record-based matching as dependencies are satisfied
implicits:
based on dynamically scoped parameters/keys (also see racket's initial values and preserves idea)
open/closed contexts for accepting outside implicits (maybe all dynamic params too, not just implicits?)
allow reinterpretation of type classes; eg. sorting w/ implicit Ord definition override
algebraic data:
reified types as values (these are NOT data constructor tags)
if a type contains value-level details, some of those details may be erasable at run-time
ex. if we know that a particular Maybe value is always a Just, we can represent the value without a tag at run-time
this is not limited to tags: if we know that a particular value is always [1, 2, 3], that value may be erasable as well
of course, erasure should not be observable
tags (these are not types, see above)
constructors
destructors
data declarations typically produce some names of the form Tag; one for each variant
Tag as a record : {tag : TypeTag, ## whatever the type of tags is
constructor : stuff -> |Tag stuff|, ## result is a refined type
destructor : |Tag stuff| -> (stuff -> result) -> result}
Tag will be applicable directly (something like a typeclass Apply?) which will use its constructor
type tower:
allow values to be raised to type-level computations, and vice-versa; generalizes upwards (kinds, etc.)
how to fit in propositions?
intersections/unions/subtyping/recursive types - avoid universal/existential quantification
principal typing property: is it available under these circumstances?
R-Acyclic Semi-unification
dynamic checking = type checking occurs in tandem with evaluation (one step ahead?)
recursive types due to mutually-recursive module linking still needs consideration
type families: consider procedure that dynamically produces new types of the same form
sub-structural typing and delimited continuations: shift/reset, control/prompt, spawn, abort, etc.
concrete syntax:
context-sensitive syntactic values bound to variables (such as constructors); or should it be values, not variables?
above explainable via syntactic records with context tags as field labels?
partial macro expansion
define expansion only of some macro forms within an expression, either guarding entry, or giving permission